Methods for Exploratory Assessment of Consent-to-Link in a Household Survey

Abstract

There is increasing interest in linking survey data to administrative records to reduce respondent burden and enhance the amount and quality of information available on sample respondents. In many cases, legal constraints or societal norms require survey organizations to obtain informed consent from sample units before linking survey responses with administrative data. Guiding such efforts is a growing empirical literature examining factors that impact respondents’ consent decisions and the success of linkage attempts, as well as evaluations of potential differences between consenting and non-consenting respondents. This paper outlines a range of options that statistical organizations can consider for evaluation and testing of linked datasets. We apply methods for assessing consent propensity and consent bias to data from the U.S. Consumer Expenditure Survey, and investigate the impacts of demographic, socio-economic, and attitudinal variables on respondents’ consent-to-link propensities. We then analyze potential consent-to-link biases in mean and quantile estimates of several economic variables, by comparing different propensity-adjusted and unadjusted estimates, and by comparing estimates from consenting and non-consenting respondents. We contrast several estimation approaches, and discuss implications of our findings for consent-propensity assessments and for approaches to minimize risks of consent-to-link bias.

1. INTRODUCTION

Survey organizations face many challenges in their efforts to produce high-quality survey data. The costs of data collection and the demand for data products are greater than ever, and survey budgets often are under serious strain to meet these demands. Declining survey response rates further complicate cost and data-quality considerations.
Given these challenges, survey organizations increasingly are exploring the possibility of linking survey data to administrative records. Combining survey and administrative data on the same sample unit has the potential to reduce the cost, length, and perceived burden of a survey; enrich our understanding of the underlying substantive phenomena; and offer a mechanism for targeted assessments of survey error components. Linking survey data to administrative data sources on the same individual or household requires matching records from one dataset to the other. The efficiency and success of this matching process depends on the variables and linkage strategy used to establish the link. Exact matching techniques are most successful when unique identifying information such as a social security number (SSN) is available, but these techniques can also be effective in the absence of unique identifiers when combinations of other personal variables are compared (e.g., last name, date of birth, street name) (see Herzog, Scheuren, and Winkler 2007 for a review of statistical linkage techniques and related data-cleaning issues). Before any linkage attempt can be made, however, most countries require that survey respondents give their informed consent to link, and consent rates can vary considerably—from as low as 19% to as high as 96% (Sakshaug, Couper, Ofstedal, and Weir 2012). Lower consent rates are potentially a major challenge to wider adoption of record-linkage in statistical agencies because they increase the risk of bias in estimates derived from combined data (to the extent that there are systematic differences in key outcome measures between those who consent to link and those who do not). As interest in and adoption of record-linkage methods have increased, so too have investigations into factors associated with respondents’ consent decisions and their potential impact on consent bias. 
In general, consent-to-link phenomena can be viewed as a type of incomplete-data problem and, thus, can make use of the broad spectrum of conceptual and methodological tools that have developed for work with incomplete data. Examples include assessment of cognitive and social processes that lead to survey response (or consent to link); modeling and diagnostic tools for estimation and evaluation of related propensity models; and empirical assessment of biases resulting from the incomplete-data pattern. To date, direct examinations of consent bias in estimates derived from combined data are extremely rare in the literature because they require researcher access to administrative records for both consenters and non-consenters (cf. Sakshaug and Kreuter 2012; Kreuter, Sakshaug, and Tourangeau 2016). Most studies of linkage consent therefore have used data from survey respondents—available for both consenters and non-consenters—to identify characteristics correlated with consent propensity. These studies provide an indirect means of assessing the potential risk of consent bias if the underlying consent propensities are related to differences in respondents’ administrative record profiles. Findings from this literature indicate that linkage consent is often associated with respondent demographics (e.g., age, education, income), indicators of survey reluctance (e.g., prior nonresponse in a panel survey), and features of the survey (e.g., wording, placement, and timing of consent requests), but the magnitude and direction of these effects vary across studies (Bates 2005; Dahlhamer and Cox 2007; Sala, Burton, and Knies 2012; Sakshaug, Tutz, and Kreuter 2013). Only very recently have researchers started to develop theory-based hypotheses about the mechanisms of consent decisions, and to incorporate more sophisticated analytic approaches to test these hypotheses (Sala et al. 2012; Sakshaug et al. 2012; Mostafa 2015). 
Finally, empirical assessment of consent-to-link propensity patterns and related potential consent biases naturally involves a complex set of trade-offs among (a) the degree to which a given set of test conditions is relevant to current or prospective production conditions; (b) the ability to control applicable design factors and to measure relevant covariates, within the context of those current production conditions; (c) the ability to measure and model specific portions of the complex processes that lead to respondent consent and cooperation in a given setting; and (d) constraints on resources, including both the direct costs of testing consent-to-link options and the indirect costs arising from the potential impact of testing on current survey production.

The remainder of this paper considers some aspects of issues (a)–(d), with emphasis on exploratory analyses for one case study that was embedded within a current survey production process. In the next section, we review the literature on consent decisions and consent bias. In section 3, we describe in detail a specific consent-to-link case involving the US Consumer Expenditure Survey (CEQ) and then present consent propensity models and our evaluation methodology. The results of our descriptive and multivariate examinations of consent propensity and assessments of consent bias are presented in section 4. We summarize and discuss the implications of our findings in section 5. Appendix A provides detailed descriptions of the analytic variables. Appendix B presents technical results on the variability of weights used in the CEQ dataset. Appendix C discusses some features of the goodness-of-fit tests used for the models considered in section 4. In addition, an online Appendix D (Supplementary Materials) presents some related conceptual material and numerical results for hypothesis testing that led to the modeling results summarized in section 4.

2. LITERATURE REVIEW

2.1 Factors Affecting Linkage Consent

The earliest investigations of consent-to-linkage effects were mostly conducted in epidemiology and health studies that requested patients’ consent to access their medical records (Woolf, Rothemich, Johnson, and Marsland 2000; Dunn, Jordan, Lacey, Shapley, and Jinks 2004; Kho et al. 2009), but more recent studies have assessed consent in general population surveys with linkage requests to an array of administrative data sources (Knies, Burton, and Sala 2012; Sala et al. 2012; Sakshaug et al. 2012, 2013). In this section we summarize findings from this research on the factors that affect consent to data linkage.

2.1.1 Respondent demographics

Linkage consent studies largely have focused on respondents’ sociodemographic characteristics, most often age, gender, ethnicity, education, and income (Kho et al. 2009; Fulton 2012). These variables are widely available across surveys, and although they are unlikely to have direct causal impact on most consent decisions, they provide indirect measures of psychosociological factors that may influence those choices. Demographic differences between consenters and non-consenters are common, but the patterns of findings differ across studies. For example, older individuals frequently have been found to be less likely to consent to record linkage than younger people (Dunn et al. 2004; Bates 2005; Dahlhamer and Cox 2007; Huang, Shih, Chang, and Chou 2007; Pascale 2011; Sala et al. 2012; Al Baghal, Knies, and Burton 2014). But some studies have found the opposite effect (Woolf et al. 2000; Beebe et al. 2011) or no age effect (Kho et al. 2009). Males often consent at higher rates than females (Dunn et al. 2004; Bates 2005; Woolf et al. 2000; Sala et al. 2012; Al Baghal et al. 2014), but some studies find no gender effect (Pascale 2011; Sakshaug et al. 2012).
Consent propensities for ethnic minorities and non-citizens tend to be lower than for majority groups and citizens (Woolf et al. 2000; Beebe et al. 2011; Al Baghal et al. 2014; Mostafa 2015), although not all studies show these effects (Bates 2005; Kho et al. 2009). Similar inconsistencies are evident across studies for the effects of education and income on consent (Kho et al. 2009; Fulton 2012; Sakshaug et al. 2012; Sala et al. 2012). Other respondent demographic and household characteristics have been examined less frequently (e.g., marital status, employment status, household size, and owner/renter), again with mixed findings (Olson 1999; Jenkins, Cappellari, Lynn, Jackle, and Sala 2006; Al Baghal et al. 2014; Mostafa 2015).

2.1.2 Respondent attitudes

Attitudes can have a powerful impact on thought and behavior, and there is a long history of survey researchers attempting to measure respondents’ attitudes and their impact on various survey outcomes (e.g., Goyder 1986). Particular attention has been given to respondents’ attitudes about privacy and confidentiality. Research conducted by the US Census Bureau going back to the 1990s demonstrates that concerns about personal privacy and data confidentiality have increased in the general public and that these attitudes are associated with lower participation rates in the decennial census (Singer 1993, 2003) and more negative attitudes toward the use of administrative records (Singer, Bates, and Van Hoewyk 2011). Privacy and confidentiality concerns can influence record linkage consent, as well. Both direct measures of privacy concerns (respondent self-reports) and indirect indicators (item refusals on financial questions) have been shown to be negatively associated with consent (Sakshaug et al. 2012; Sala, Knies, and Burton 2014; Mostafa 2015). Similarly, Sakshaug et al.
(2012) demonstrated that the more confidentiality-related concerns respondents expressed to interviewers in a previous survey wave, the less likely they were to subsequently consent to data linkage. And Sala et al. (2014) found that concern about data confidentiality was the most frequent reason given by respondents who declined a linkage request. There also is evidence that trust (in other people, in government) and civic engagement (volunteering, political involvement) are positively related to consent (Sala et al. 2012; Al Baghal et al. 2014).

2.1.3 Saliency

Respondents’ interest in topics related to the record request or their experiences with organizations that house those records can also affect consent decisions. For example, a number of studies have found that respondents have a higher propensity to accept medical consent requests when they are in poorer health or have symptoms germane to the survey subject (Woolf et al. 2000; Dunn et al. 2004; Dahlhamer and Cox 2007; Beebe et al. 2011). One explanation for this finding is that consent requests on topics salient to respondents enhance the perceived benefits of record linkage (e.g., more comprehensive medical evaluation or the general advancement of knowledge about a disease relevant to the respondent) or reduce the perceived risks (e.g., by inducing more extensive cognitive processing of the request) (Groves, Singer, and Corning 2000). In addition to topic saliency, respondents’ existing relationships with government agencies also can play a role in their consent decisions. Studies by Sala et al. (2012), Sakshaug et al. (2012), and Mostafa (2015), for example, found that individuals who received government benefits (e.g., welfare, food stamps, veterans’ benefits) were more likely to consent to economic data linkage than those who did not.
These results again suggest that the salience of (and attitudes toward) service-providing government agencies may make some respondents more amenable to linkage requests involving those agencies.

2.1.4 Socio-environmental features

The respondents’ environments help to shape the context in which consent decisions are made, and a handful of studies have examined associations between area characteristics and attitudes toward use of administrative records and consent decisions. For example, Singer et al. (2011) found that individuals living in the South and Mid-Atlantic regions of the country had more favorable attitudes about administrative record use by the US Census Bureau than those living in other regions of the country. Studies of actual consent and linkage rates have demonstrated regional variations as well, with higher rates in the South and Midwest and lower rates in parts of the Northeast (e.g., Olson 1999; Dahlhamer and Cox 2007). Consent rates also can vary by urban status. Consistent with urbanicity effects seen in the literature on survey participation and pro-social behavior (Groves and Couper 1998; Mattis, Hammond, Grayman, Bonacci, Brennan, et al. 2009), respondents living in urban areas have been found to be less likely to consent than those living in non-urban areas (cf. Jenkins et al. 2006; Dahlhamer and Cox 2007; Al Baghal et al. 2014, who show a marginally significant positive effect for urbanicity). Together such area effects may indicate the influence of underlying ecological factors within those communities (e.g., differences in population density, crime, social engagement), but may also reflect differences in survey operations (e.g., in staff, protocol, and training) clustered within those geographic areas. A recent study by Mostafa (2015) found area characteristics by themselves added little explanatory power to models of consent propensity, suggesting that respondent and interview characteristics may be more important factors.
2.1.5 Interviewer characteristics

Interviewer attributes and behaviors can have significant impact on survey participation and data quality (O’Muircheartaigh and Campanelli 1999; West, Kreuter, and Jaenichen 2013), including linkage consent decisions. Studies investigating the impact of interviewer demographics generally find that they are unrelated to the consent outcome (Sakshaug et al. 2012; Sala et al. 2012), although there is some evidence of a positive effect of interviewer age on consent (Korbmacher and Schroeder 2013; Al Baghal et al. 2014). Interviewer experience has shown mixed effects. The amount of time spent working as an interviewer overall (i.e., job tenure) is either unrelated to consent (Sakshaug et al. 2012; Sala et al. 2012) or can actually have a small negative impact (Sakshaug et al. 2013; Al Baghal et al. 2014). Interviewers’ survey-specific experience, as measured by the number of interviews already completed prior to the current consent request, shows similar effects (Sakshaug et al. 2012; Sala et al. 2012). One aspect of interviewer experience that is positively related to consent in these studies is past performance in gaining respondent consent. Sala et al. (2012) and Sakshaug et al. (2012) found that the likelihood of consent increased with the number of consents obtained earlier in the field period. These authors also attempted to identify interviewer personality traits and attitudes that could affect respondent consent decisions, but largely failed to find significant effects. The one exception was that respondent consent was positively related to interviewers’ own willingness to consent to linkage (Sakshaug et al. 2012).

2.1.6 Survey design features

The way in which the consent requests are administered can impact linkage consent. Consent rates appear to be higher in face-to-face surveys than in phone surveys, though there are relatively few mode studies that have examined this phenomenon (Fulton 2012).
Consent questions that ask respondents to provide personal identifiers (e.g., full or partial SSNs) as matching variables produce lower consent rates than those that do not (Bates 2005). This finding and advancements in statistical matching techniques prompted the Census Bureau to change its approach to gaining linkage consent in 2006, and it has since adopted a passive, opt-out consent procedure in which respondents are informed of the intent to link, and consent is assumed unless respondents explicitly object (McNabb, Timmons, Song, and Puckett 2009). These implicit consent procedures (as they are sometimes called) result in higher consent rates than opt-in approaches where respondents must affirmatively state their consent (Bates 2005; Pascale 2011). See also Das and Couper (2014). Since most surveys employ opt-in formats, researchers have focused on the potential effects of the wording or framing of these questions. Consent framing experiments vary factors mentioned in the request that are thought to be persuasive to respondents, for example, highlighting the quality benefits associated with linkage, the reduction in survey collection costs, or the time savings for respondents. Evidence of framing effects in these studies is surprisingly weak, however. Bates, Wroblewski, and Pascale (2012) found that respondents reported more positive attitudes toward record linkage under cost- and time-savings frames, but the study did not measure actual linkage consent propensities. In Sakshaug and Kreuter (2014), a time-savings frame produced higher consent rates for web survey respondents than a neutrally worded consent question, but this is the only study in the literature to find significant question-framing effects (Pascale 2011; Sakshaug et al. 2013). The timing of consent requests appears to have some influence on likelihood of consent. 
Although it is common practice to delay asking the most sensitive items like linkage-consent requests until near the end of the questionnaire, recent empirical evidence indicates that this may not be optimal. Sakshaug et al. (2013) found that respondents were more amenable to consent requests administered at the beginning of the survey than at the end and suggest that the proximity of the survey and linkage requests may reinforce respondents’ desire or inclination to be consistent (i.e., agree to both). This result could inform potentially promising adaptive design interventions (e.g., asking for consent early in the interview, then skipping subsequent burdensome questions for consenters). Sala et al. (2014) obtained higher consent rates when the request was asked immediately following a series of questions on a related topic rather than waiting until the end of the survey. The authors reasoned that contextual placement of the linkage question increases the salience of the request and induces more careful consideration by respondents. Both explanations find some support in the broader psychological literature on compliance, cognitive dissonance, and context effects, but further research is needed to evaluate these and other mechanisms (e.g., survey fatigue) underlying consent placement effects (Sala et al. 2014).

2.2 Analytic Approaches to Assessing Consent Bias

Early linkage consent studies simply looked for evidence of sample bias (i.e., differences in sample composition for consenters and non-consenters). Most recent studies use logistic regression models to identify factors that influence consent and infer potential consent bias (e.g., differential consent to medical-records linkage by respondent health status).
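As a rough illustration of the kind of consent-propensity model such studies fit, the sketch below estimates a logistic regression of consent on a single binary predictor by Newton-Raphson. The records, the predictor (an indicator for membership in an older age group), and the function name `fit_logistic` are hypothetical and are not taken from any of the cited studies; applied work would use many predictors and survey-design-aware software.

```python
import math

def fit_logistic(records, n_iter=25, tol=1e-10):
    """Newton-Raphson fit of logit P(consent = 1) = b0 + b1 * x.

    records: iterable of (x, y, w) with predictor x, consent indicator y (0/1),
    and a nonnegative weight w. Illustrative only: two parameters, no checks.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = a = b = c = 0.0
        for x, y, w in records:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            # Score (gradient of the weighted log-likelihood)
            g0 += w * (y - p)
            g1 += w * (y - p) * x
            # Information matrix [[a, b], [b, c]]
            v = w * p * (1.0 - p)
            a += v
            b += v * x
            c += v * x * x
        det = a * c - b * b
        # Newton step: solve the 2x2 system (information) * delta = score
        d0 = (c * g0 - b * g1) / det
        d1 = (a * g1 - b * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Hypothetical data: 400 respondents with x = 0 (320 consent),
# 100 respondents with x = 1 (70 consent), all with weight 1.
records = (
    [(0, 1, 1.0)] * 320 + [(0, 0, 1.0)] * 80 +
    [(1, 1, 1.0)] * 70 + [(1, 0, 1.0)] * 30
)

b0, b1 = fit_logistic(records)
print(f"intercept = {b0:.4f}, slope = {b1:.4f}")
```

Because this toy model has one binary predictor, the fitted coefficients simply reproduce the group-level log-odds: the intercept equals log(0.8/0.2) and the slope equals log(0.7/0.3) minus log(0.8/0.2). A negative slope here corresponds to lower consent propensity in the hypothetical older group.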
Several studies have employed multi-level models to assess the impact of interviewers on consent propensity (e.g., Fulton 2012; Sala et al. 2012; Mostafa 2015), and others have jointly modeled respondents’ consent propensities on multiple consent items in a given survey (e.g., Mostafa 2015). Studies that have examined direct estimates of consent bias, using administrative records available for both consenters and non-consenters, are much less common in the literature. Recent studies by Sakshaug and his colleagues are notable exceptions (Sakshaug and Kreuter 2012; Sakshaug and Huber 2016). Using administrative data linked to German panel survey data, they have compared the magnitude of consent biases to bias estimates for other error sources (nonresponse, measurement) and the longitudinal changes in these biases.

Given evidence of potential consent bias (e.g., differential consent by specific demographic groups), one promising approach that has not yet been explored in this literature is to adopt propensity weighting methods that are widely used in nonresponse adjustment. Traditionally, propensity-weighted nonresponse adjustments are accomplished by modeling response propensities using logistic regression and auxiliary data available for both respondents and nonrespondents, and then using the inverse of the modeled propensity as a weight adjustment factor (Little 1986). If the predicted propensity is unbiased, this adjustment method may reduce the potential bias due to nonresponse. Of course, bias reduction is predicated on correct model specification, and this may be particularly challenging in consent propensity applications given the absence of well-developed consent theories and inconsistency in the effects of many of its predictors. Application of these ideas in the analysis of “consent-to-link” patterns is complicated by two issues.
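Before turning to those issues, the inverse-propensity adjustment described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in any cited study: the respondent records, group labels, and function names are hypothetical, and consent propensities are estimated as weighted consent rates within groups (a weighting-class stand-in for a fitted logistic regression; with a single categorical predictor the two give the same predicted propensities).

```python
from collections import defaultdict

# Hypothetical respondents: (group, base_weight, consented, expenditure)
respondents = [
    ("young", 1.0, 1, 10.0), ("young", 1.0, 0, 10.0),
    ("young", 1.0, 1, 14.0), ("young", 1.0, 0, 14.0),
    ("young", 1.0, 1, 12.0), ("young", 1.0, 1, 12.0),
    ("old", 1.0, 1, 20.0), ("old", 1.0, 1, 24.0),
    ("old", 1.0, 0, 22.0), ("old", 1.0, 0, 22.0),
]

def weighted_mean(pairs):
    """Weighted mean of (weight, value) pairs."""
    total_weight = sum(w for w, _ in pairs)
    return sum(w * v for w, v in pairs) / total_weight

def consent_propensities(rows):
    """Weighted consent rate within each group (stand-in for a fitted model)."""
    num, den = defaultdict(float), defaultdict(float)
    for group, w, consented, _ in rows:
        num[group] += w * consented
        den[group] += w
    return {g: num[g] / den[g] for g in den}

def compare_estimates(rows):
    """Return (full-sample, consent-only, propensity-adjusted) weighted means."""
    p = consent_propensities(rows)
    full = weighted_mean([(w, y) for _, w, _, y in rows])
    consent_only = weighted_mean([(w, y) for _, w, c, y in rows if c == 1])
    # Consenters' base weights are divided by their estimated propensity
    adjusted = weighted_mean([(w / p[g], y) for g, w, c, y in rows if c == 1])
    return full, consent_only, adjusted

full, consent_only, adjusted = compare_estimates(respondents)
print(f"full = {full:.3f}, consent-only = {consent_only:.3f}, adjusted = {adjusted:.3f}")
```

In this constructed example the consent-only mean falls below the full-sample mean because the higher-spending group consents less often, and the adjustment removes the gap. That clean result depends on consenters being representative of their group on the outcome, which is exactly the model-specification caveat raised above.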
First, the decision of a respondent to provide formal consent to link does not guarantee the successful linkage of that unit to a given external data source. For example, a nominally consenting respondent may fail to provide specific forms of information (e.g., account numbers) required to perform the linkage to the external source; the external source itself may be subject to incomplete-data problems; or there may be other problems with imperfect linkage as outlined in Herzog et al. (2007). Consequently, it can be useful to consider a decomposition:

\[ p_L(x; \beta_L) = p_C(x; \beta_C) \, p_{L|C}(x; \beta_{L|C}) \tag{2.1} \]

where $p_L(x; \beta_L)$ is the probability that a unit with predictor variable $x$ will ultimately have a successful linkage to a given data source; $p_C(x; \beta_C)$ is the probability that this unit will provide nominal informed consent to link; $p_{L|C}(x; \beta_{L|C})$ is the probability of successful linkage, conditional on the unit providing consent; and $\beta_L$, $\beta_C$ and $\beta_{L|C}$ are the parameter vectors for the three respective probability models. Note that to some degree, the first factor, $p_C(x; \beta_C)$, is analogous to the probability of unit response in a standard survey setting, and the second factor $p_{L|C}(x; \beta_{L|C})$ is analogous to the probability of section or item response. In addition, note that the conditional probabilities $p_{L|C}(x; \beta_{L|C})$ may depend on a wide range of factors, including perceived sensitivity of a given set of linkage variables that the respondent may need to provide.

2.3 Options for Exploratory Analyses of Consent-to-Link

Ideally, one would explore informed-consent issues by estimating all of the parameters of model (2.1) and by evaluating potential non-consent-based biases for estimators of a large number of population parameters of interest. However, in-depth empirical work with consent for record linkage imposes a substantial burden on field data collection.
In addition, large-scale implementation of record linkage incurs substantial costs related to production systems and “data cleaning” for the variables on which we intend to link and also incurs a risk of disruption of the ongoing survey production process. Consequently, it is important to identify alternative approaches that allow initial exploration of some aspects of model (2.1) with considerably lower costs and risks. For example, one could consider the following sequence of exploratory options:

(i) Simple lab studies. This step has the advantages of not disrupting production and relatively low costs. However, results may not align with “real world” production conditions (per the preceding literature results on interviewer characteristics) nor full population coverage (per the comments on respondent demographics and socio-environmental features).

(ii) Addition of simple consent-to-link questions to standard production instruments. This approach may incur a relatively low risk of disruption of production and relatively low incremental costs and has the advantage of being naturally embedded in production conditions.

(iii) Same as (ii) but with actual linkage to administrative records. This approach incurs some additional cost (to carry out record linkage and related data-management and cleaning processes). In addition, this option may incur some additional respondent burden arising from collection of information required to enhance the probability of a successful link. On the other hand, option (iii) allows assessment of additional linkage-related issues (e.g., cases in which conditional linkage probabilities $p_{L|C}(x; \beta_{L|C})$ are less than 1).

(iv) Full-scale field tests of consent-to-link. This option will incur higher costs and higher risks of disruption of production processes, but will generally be considered necessary before an organization makes a final decision to implement record linkage in production processes.
Also, in some cases, interviewer attitudes and behaviors may differ between cases (ii) and (iv).

Due to the balance of potential costs, risks and benefits arising from options (i) through (iv), it may be useful to focus initial exploratory attention on options (i) and (ii), and then consider use of options (iii) and (iv) for cases in which the initial results indicate reasonable prospects for successful implementation. The current paper presents a case study of option (ii) for the Consumer Expenditure Survey, with principal emphasis on evaluation of the extent to which variability in the propensity to consent may lead to biases in unadjusted estimators of some commonly studied economic variables when restricted to consenting sample units; and evaluation of the extent to which simple propensity-based adjustments may reduce those potential biases.

3. DATA AND METHODS

3.1 Possible Linkage of Government Records with Sample Units from the US Consumer Expenditure Survey

In this paper, we extend the survey-based approach to assessing consent propensity and consent bias in the context of a large household expenditure survey, the US Consumer Expenditure Quarterly Interview Survey (CEQ), sponsored by the Bureau of Labor Statistics (BLS). The CEQ is an ongoing, nationally representative panel survey that collects comprehensive information on a wide range of consumers’ expenditures and incomes, as well as the characteristics of those consumers. It is designed to collect one year’s worth of expenditure data from sample units through five interviews; the first interview is for bounding purposes only, and the remaining four interviews are conducted at three-month intervals.1 It is a rather long and burdensome survey, and the ability to link CEQ data to relevant administrative data sources (e.g., IRS data for income/assets) could eliminate the need to ask respondents to report some of these data themselves.
In principle, linkage with administrative records also could allow one to capture information that would be difficult or impossible to collect through a survey instrument. This latter motivation is of some potential interest for CE, but at present may be somewhat secondary relative to reduction of current burden levels. The CEQ employs a complex sample using a stratified-clustered design, and each calendar quarter approximately 7,100 usable interviews are conducted. In 2011, the CEQ achieved a response rate of 71.5% (BLS 2014). The survey is administered by computer-assisted personal interviewing (CAPI), either by personal visit or by telephone. Mode selection is determined jointly by the interviewer and respondent, though personal visits are encouraged, particularly in the second and fifth interviews when more detailed financial information is collected. Telephone interviewing is conducted by the same CE interviewer assigned to the case, using the same CAPI instrument as used in the personal visit interviews. Beginning in 2011, BLS conducted research to explore the feasibility and potential impacts of integrating administrative data with CEQ survey responses. CEQ respondents who completed their final interview were asked whether they would object to combining their survey answers with data from other government agencies (Davis, Elkin, McBride, and To 2013, Section III). Nearly 80% of respondents had no objection to linkage. Although no actual data linkage occurred, we use this 2011 data to explore the extent to which prospective replacement of survey responses with administrative data could impact the quality of production estimates. To do this, we develop and compare consent propensity models that incorporate demographic, household, environmental, and attitudinal predictors suggested by the literature on linkage consent, attitudes towards administrative record use, and survey nonresponse. 
We then explore an approach for examining potential consent bias, by comparing full-sample, consent-only, and propensity-weight-adjusted expenditure estimates.

This study used CEQ data from April 2011 through March 2012. During that period, respondents who completed their fifth and final CEQ interview were administered the following data-linkage consent item: “We’d like to produce additional statistical data, without taking up your time with more questions, by combining your survey answers with data from other government agencies. Do you have any objections?” Of the 5,037 respondents who were asked this question, 78.4 percent had no objections, 18.7 percent objected, and another 2.8 percent gave a “don’t know” response or were item nonrespondents on this item. We restrict our analyses to a dichotomous outcome indicator for the 4,893 respondents who consented or explicitly objected to the linkage request.

3.2 Consent Propensity Models

We develop logistic regression models that estimate sample members’ propensity to consent to the CEQ linkage request: the dependent variable takes a value of 1 if respondents consent to linkage and a value of 0 if they do not consent. Model specifications were developed through fairly extensive exploratory analyses (e.g., examinations of descriptive statistics, theory-based bivariate logistic regressions, and stepwise logistic regressions). Some results from these exploratory analyses are provided in the online Appendix D. To account for the complex stratified sampling design of CEQ, the analyses were conducted with the SAS surveylogistic procedure. All point estimates reported in this paper are based on standard complex design weights, and all standard errors are based on balanced repeated replication (BRR) using 44 replicate weights, with a Fay factor, K = 0.5. In addition, we used F-adjusted Wald statistics to evaluate goodness-of-fit (GOF) for our models (table 7 in Appendix C).
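For readers unfamiliar with Fay-adjusted BRR, the sketch below shows the usual form of the variance estimator: the sum of squared deviations of the replicate estimates from the full-sample estimate, divided by R(1 − K)², where R is the number of replicates and K is the Fay factor. The function name and the numbers are hypothetical (four replicates are used for brevity rather than 44), and it is an assumption that this standard form matches the software implementation used for the analyses.

```python
import math

def fay_brr_variance(full_estimate, replicate_estimates, fay_k=0.5):
    """Fay-adjusted BRR variance: sum of squared deviations / (R * (1 - K)^2)."""
    r = len(replicate_estimates)
    sum_sq = sum((est - full_estimate) ** 2 for est in replicate_estimates)
    return sum_sq / (r * (1.0 - fay_k) ** 2)

# Hypothetical full-sample estimate and replicate estimates
full_estimate = 10.0
replicate_estimates = [9.5, 10.5, 10.2, 9.8]

variance = fay_brr_variance(full_estimate, replicate_estimates)
standard_error = math.sqrt(variance)
print(f"variance = {variance:.4f}, SE = {standard_error:.4f}")
```

With K = 0.5 the divisor becomes R/4, so the estimator is four times the mean squared deviation of the replicate estimates. In practice each replicate estimate would be obtained by re-running the same weighted analysis with one set of replicate weights.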
Table 7. F-adjusted Mean Residual Goodness-of-Fit Test

| Model | F-adjusted goodness-of-fit test p-value |
|---|---|
| 1 | 0.292 |
| 2 | 0.810 |

Table 8. Multivariate Logistic Models Predicting Consent-to-Link (Weighted)

| Variable | Model 1 Estimate (SE) | Model 2 Estimate (SE) | Model A Estimate (SE) | Model B Estimate (SE) | Model C Estimate (SE) | Model D Estimate (SE) | Model E Estimate (SE) |
|---|---|---|---|---|---|---|---|
| Demographic characteristics | | | | | | | |
| Age group (32–65) | | | | | | | |
|   18–32 | 0.3435*** (0.0633) | 0.3389*** (0.0634) | 0.3044*** (0.0717) | 0.3319*** (0.0742) | 0.3282*** (0.0728) | 0.3425*** (0.0640) | 0.3368*** (0.0639) |
|   65+ | −0.2692*** (0.0645) | −0.2532*** (0.0656) | −0.1971** (0.0613) | −0.2387*** (0.0608) | −0.2374*** (0.0630) | −0.2694*** (0.0644) | −0.2538*** (0.0653) |
| Gender (Male) | | | | | | | |
|   Female | −0.0161 (0.0426) | −0.0229 (0.0421) | −0.0396 (0.0359) | −0.0388 (0.0363) | −0.0020 (0.0427) | −0.0152 (0.0425) | −0.0207 (0.0419) |
| Race (White) | | | | | | | |
|   Non-white | 0.0949 (0.1264) | 0.0612 (0.1260) | −0.0781 (0.1059) | −0.0192 (0.1056) | 0.0322 (0.1155) | 0.0938 (0.1269) | 0.0593 (0.1264) |
| Spanish interview (No) | | | | | | | |
|   Yes | −0.4234 (0.2923) | −0.4452 (0.2790) | −0.4044 (0.2930) | −0.4271 (0.3120) | −0.4801 (0.3118) | −0.4287 (0.2973) | −0.4576 (0.2846) |
| Education group (HS grad) | | | | | | | |
|   Less than HS | 0.3097 (0.1879) | 0.2926 (0.1835) | 0.0142 (0.1349) | 0.0186 (0.1383) | 0.3317† (0.1838) | 0.3081 (0.1885) | 0.2894 (0.1844) |
|   Some college | 0.4016** (0.1423) | 0.3739* (0.1389) | −0.0986 (0.1087) | −0.0815 (0.1129) | 0.4066** (0.1389) | 0.3996** (0.1442) | 0.3695* (0.1409) |
|   Associate’s degree | −0.3321 (0.2041) | −0.3150 (0.1993) | −0.1109 (0.1100) | −0.1249 (0.1145) | −0.2858 (0.2160) | −0.3326 (0.2036) | −0.3160 (0.1990) |
|   Bachelor’s degree | −0.0654 (0.2112) | −0.0663 (0.2082) | 0.0383 (0.0719) | 0.0425 (0.0742) | −0.0230 (0.2037) | −0.0618 (0.2127) | −0.0581 (0.2095) |
|   Advanced degree | −0.1747 (0.2762) | −0.1512 (0.2773) | 0.1899 (0.1215) | 0.1941 (0.1235) | −0.2738 (0.2482) | −0.1720 (0.2773) | −0.1452 (0.2796) |
| Home owner (Renter) | | | | | | | |
|   Owner | −0.2767† (0.1453) | −0.3233* (0.1436) | −0.3612** (0.1277) | −0.3589** (0.1314) | −0.2703† (0.1397) | −0.2754† (0.1448) | −0.3198* (0.1433) |
| Total expenditures (Log) | −0.0605 (0.0799) | −0.0083 (0.0800) | 0.1025 (0.0698) | −0.0372 (0.0754) | −0.0419 (0.0746) | −0.0594 (0.0802) | −0.0065 (0.0801) |
| Income group | | | | | | | |
|   Less than $8,181 | −0.2155*** (0.0604) | | | −0.4182*** (0.0561) | −0.4247*** (0.0577) | −0.2142** (0.0609) | |
| Income imputed (No) | | | | | | | |
|   Yes | −0.1487** (0.0513) | | | | | −0.1481** (0.0510) | |
| Race × gender | −0.1874† (0.0956) | −0.1953* (0.0923) | | | −0.2268* (0.0930) | −0.1873† (0.0957) | −0.1946* (0.0923) |
| Owner × education | | | | | | | |
|   Less than HS | −0.4931* (0.2066) | −0.4632* (0.1996) | | | −0.4750* (0.2045) | −0.4929* (0.2067) | −0.4637* (0.2002) |
|   Some college | −0.6575*** (0.1567) | −0.6370*** (0.1613) | | | −0.6764*** (0.1438) | −0.6548*** (0.1578) | −0.6309*** (0.1633) |
|   Associate’s degree | 0.3407† (0.2013) | 0.3402† (0.1986) | | | 0.2616 (0.2090) | 0.3401† (0.2007) | 0.3387† (0.1978) |
|   Bachelor’s degree | 0.1385 (0.2402) | 0.1398 (0.2368) | | | 0.1057 (0.2296) | 0.1358 (0.2409) | 0.1335 (0.2376) |
|   Advanced degree | 0.4713 (0.2847) | 0.4226 (0.2870) | | | 0.5867* (0.2583) | 0.4691 (0.2854) | 0.4183 (0.2890) |
| Environmental features | | | | | | | |
| Region (Northeast) | | | | | | | |
|   Midwest | 0.2097† (0.1234) | 0.2111† (0.1163) | 0.2602* (0.1100) | 0.2457* (0.1196) | 0.2352† (0.1208) | 0.2098† (0.1232) | 0.2111† (0.1159) |
|   South | 0.2451* (0.1190) | 0.2408* (0.1187) | 0.1841 (0.1141) | 0.2017† (0.1149) | 0.2129† (0.1152) | 0.2446* (0.1189) | 0.2399* (0.1187) |
|   West | −0.2670* (0.1055) | −0.2557* (0.1066) | −0.2248* (0.0943) | −0.2402** (0.1001) | −0.2441* (0.1019) | −0.2678* (0.1061) | −0.2574* (0.1071) |
| Urbanicity (Rural) | | | | | | | |
|   Urban | −0.0268 (0.0829) | −0.0234 (0.0824) | −0.0253 (0.0818) | −0.0226 (0.0833) | −0.0249 (0.0821) | −0.0268 (0.0829) | −0.0233 (0.0823) |
| R attitude proxies | | | | | | | |
| Converted refusal (No) | | | | | | | |
|   Yes | −0.0721 (0.0756) | −0.0838 (0.0763) | | | | −0.0714 (0.0753) | −0.0818 (0.0760) |
| Effort (Moderate) | | | | | | | |
|   A lot of effort | 0.5454* (0.2081) | 0.6142** (0.2065) | | | | 0.5418* (0.2101) | 0.6051** (0.2082) |
|   Bare minimum effort | −0.4879*** (0.1365) | −0.5721*** (0.1371) | | | | −0.4842** (0.1387) | −0.5626*** (0.1389) |
| Doorstep concerns (None) | | | | | | | |
|   Too busy | −0.2207 (0.1485) | −0.2137 (0.1466) | | | | −0.2192 (0.1491) | −0.2106 (0.1477) |
|   Privacy/gov’t concerns | −1.1895*** (0.1667) | −1.2241*** (0.1622) | | | | −1.1891*** (0.1666) | −1.2231*** (0.1621) |
|   Other | 1.0547** (0.3261) | 1.0255** (0.3255) | | | | 1.0549** (0.3259) | 1.0267** (0.3252) |
| Doorstep concerns × effort | | | | | | | |
|   Privacy × a lot of effort | −0.5884* (0.2789) | −0.5312† (0.2728) | | | | −0.5885* (0.2789) | −0.5319† (0.2725) |
|   Privacy × minimum effort | 0.3183 (0.1918) | 0.2601 (0.1924) | | | | 0.3172 (0.1915) | 0.2581 (0.1925) |
|   Busy × a lot of effort | −0.0888 (0.2736) | −0.0890 (0.2749) | | | | −0.0841 (0.2758) | −0.0785 (0.2765) |
|   Busy × minimum effort | 0.2238 (0.2096) | 0.2277 (0.2095) | | | | 0.2219 (0.2114) | 0.2234 (0.2112) |
|   Other × a lot of effort | 1.0794† (0.6224) | 1.0477 (0.6182) | | | | 1.0746† (0.6244) | 1.0370 (0.6194) |
|   Other × minimum effort | −0.9002* (0.3610) | −0.8777* (0.3593) | | | | −0.8964* (0.3617) | −0.8693* (0.3597) |
| Rapport (Phone) | | | | | | | |
|   Personal visit | | | | | | 0.0170 (0.0428) | 0.0395 (0.0412) |

Note.—†p < 0.10, *p < 0.05, **p < 0.01, ***p < 0.001.

Table 9. F-adjusted Mean Residual Goodness-of-Fit Test p-values and Test Statistics: Model of Consent/Object Comparison

| Model (of consent), 2nd revision | F-adjusted goodness-of-fit test p-value (test statistic F(9, 35)) |
|---|---|
| 1 | 0.292 (1.262) |
| 2 | 0.810 (0.573) |
| A | 0.336 (1.183) |
| B | 0.929 (0.396) |
| C | 0.061 (2.061) |
| D | 0.022 (2.576) |
| E | 0.968 (0.305) |

3.3 Using Propensity Adjustments to Explore Reductions in Consent Bias

To examine potential consent bias in our data, we focus on six CEQ variables that (a) are from sensitive or burdensome sections of the survey and (b) could potentially be substituted with information available in administrative records. These variables were before-tax income, before-tax income with imputed values, property tax, vehicle purchase amount, property value, and rental value. For these exploratory analyses, we treat the reported CEQ values as “truth” and compare estimates from the full sample to those derived from consenters only. Despite the limitations of this approach (e.g., there is almost certainly some amount of measurement error in reported values), it provides a means of examining consent bias indirectly without incurring the costs and production disruptions of fully matching survey responses with administrative data. We apply propensity adjustments to estimates for these variables based on the estimated consent propensity scores calculated for each respondent (weighting each respondent by the inverse of their consent propensity; see additional details in section 4.2). We then compare full-sample estimates to those from weight-adjusted consenters only to explore the extent to which adjustment techniques can reduce consent bias.
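The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the consent propensities are taken as given (in the paper they come from a fitted logistic model), the function names are hypothetical, and the toy data are invented so that consent depends only on a two-level grouping. Each consenter's design weight is divided by the estimated consent propensity, and three weighted means are reported side by side.

```python
def weighted_mean(values, weights):
    """Weighted mean of values; weights must sum to a positive number."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def consent_bias_comparison(values, base_weights, consented, propensities):
    """Compare full-sample, consenters-only, and propensity-adjusted means.

    values: reported survey values (treated as "truth" for this exercise).
    base_weights: design weights for all respondents.
    consented: True if the respondent consented to linkage.
    propensities: estimated consent propensity for each respondent, in (0, 1].
    """
    if any(not 0.0 < p <= 1.0 for p in propensities):
        raise ValueError("propensities must lie in (0, 1]")
    full_sample = weighted_mean(values, base_weights)
    consenters = [
        (v, w, p)
        for v, w, c, p in zip(values, base_weights, consented, propensities)
        if c
    ]
    consenter_values = [v for v, _, _ in consenters]
    unadjusted = weighted_mean(consenter_values, [w for _, w, _ in consenters])
    # Inverse-propensity adjustment: design weight divided by consent propensity.
    adjusted = weighted_mean(consenter_values, [w / p for _, w, p in consenters])
    return {
        "full_sample": full_sample,
        "consenters_unadjusted": unadjusted,
        "consenters_adjusted": adjusted,
    }


# Hypothetical toy data: a low-value group consents half the time,
# a high-value group always consents.
toy_values = [10.0, 10.0, 10.0, 10.0, 20.0, 20.0]
toy_weights = [1.0] * 6
toy_consented = [True, True, False, False, True, True]
toy_propensities = [0.5, 0.5, 0.5, 0.5, 1.0, 1.0]
toy_result = consent_bias_comparison(
    toy_values, toy_weights, toy_consented, toy_propensities
)
print(toy_result)
```

In this toy case the adjusted consenter mean matches the full-sample mean because consent depends only on the grouping that defines the propensities; with real data the adjustment removes bias only to the extent that the propensity model captures the variables related to both consent and the outcome.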
This propensity-adjustment procedure is analogous to weighting methods commonly used to reduce other sources of bias in sample surveys, such as nonresponse and coverage bias (e.g., Groves, Dillman, Eltinge, and Little 2002). For this analysis, we used propensity model 2, for two reasons. First, in keeping with the general approach to propensity modeling for incomplete data (e.g., Rosenbaum and Rubin 1983), we especially wished to condition on predictor variables that may potentially be associated with both the consent decision and the underlying economic variables of interest. Because one generally would seek to use the same propensity model for adjusted estimation for each economic variable of interest, and different economic variables may display different patterns of association with the candidate predictors, one is inclined to be relatively inclusive in the choice of predictor variables in the propensity model. Second, an important exception to this general approach is that one would not wish to include predictor variables that depend directly on reported income variables; consequently, for the propensity-weighting work, we used model 2 (which excludes the low-income and imputed-income indicators used in model 1).

4. RESULTS

4.1 Consent-to-Linkage Predictors

We begin by examining indicators of the proposed mechanisms of consent. Table 1 shows the weighted percentages (means) and standard errors for each indicator for the full sample and separately for consenters and non-consenters.

Table 1. Weighted Estimates of Indicators of Hypothesized Consent Mechanisms

| Indicator | All respondents (n = 4,893) | Consenters (n = 3,951) | Non-consenters (n = 942) |
|---|---|---|---|
| Privacy/anti-government concerns*** | 6.9 (0.8) | 4.2 (0.6) | 18.1 (2.2) |
| Reluctance | | | |
|   Converted refusal*** | 10.8 (0.7) | 9.6 (0.8) | 16.0 (1.5) |
|   Income imputation required*** | 44.7 (1.0) | 41.8 (1.1) | 57.0 (2.1) |
|   Too busy** | 8.1 (0.5) | 6.9 (0.6) | 12.7 (1.3) |
|   Respondent effort*** | | | |
|     A lot of effort | 34.9 (0.9) | 36.7 (1.0) | 27.3 (1.9) |
|     Moderate effort | 37.2 (1.1) | 37.4 (1.3) | 36.4 (1.8) |
|     Bare minimum effort | 27.9 (1.1) | 25.9 (1.2) | 36.3 (2.3) |
|   Respondent cooperation*** | | | |
|     Very cooperative | 50.8 (1.0) | 53.5 (1.0) | 39.4 (2.6) |
|     Somewhat cooperative | 33.1 (0.8) | 33.1 (0.9) | 33.1 (1.3) |
|     Neither cooperative nor uncooperative | 5.4 (0.5) | 4.5 (0.4) | 9.2 (1.1) |
|     Somewhat uncooperative | 4.5 (0.3) | 3.3 (0.3) | 9.5 (1.1) |
|     Very uncooperative | 6.3 (0.4) | 5.7 (0.5) | 8.8 (0.9) |
| Rapport*** | | | |
|   Face-to-face | 65.0 (1.4) | 66.2 (1.5) | 59.9 (1.9) |
|   Phone | 35.0 (1.4) | 33.8 (1.5) | 40.1 (1.9) |
| Burden | | | |
|   Total interview time | 64.9 (1.0) | 65.3 (1.1) | 63.3 (1.4) |
|   Household size | 2.42 (0.03) | 2.42 (0.03) | 2.42 (0.07) |
|   Total expenditures | $11,454 ($148) | $11,511 ($145) | $11,218 ($384) |
|   Family income*** | | | |
|     LT $8,180 | 19.8 (0.8) | 17.4 (0.9) | 30.8 (1.9) |
|     $8,180–$24,000 | 20.2 (0.6) | 20.8 (0.9) | 16.8 (1.4) |
|     $24,001–$46,000 | 20.8 (0.6) | 20.5 (0.7) | 18.4 (1.3) |
|     $46,001–$85,855 | 19.8 (0.6) | 20.7 (0.7) | 16.7 (1.3) |
|     GT $85,585 | 19.4 (0.6) | 20.6 (0.9) | 17.3 (1.8) |
|   Owner*** | 65.4 (0.8) | 64.0 (0.9) | 71.3 (1.6) |

Note.—Standard errors shown in parentheses. Asterisks denote statistically significant differences (**p < 0.01; ***p < 0.0001).

We find evidence that privacy concerns are more prevalent among respondents who objected to the linkage request than among those who gave their consent (18.1% versus 4.2%), consistent with our hypothesis. There is also support at the bivariate level for a reluctance mechanism: non-consenters were significantly more likely than consenters to require refusal conversion (16.0% versus 9.6%) and income imputation (57.0% versus 41.8%), to express concerns about the time required by the survey (12.7% versus 6.9%), and to be rated as less effortful and cooperative by CEQ interviewers.
Additionally, non-consenting respondents had a higher proportion of phone interviews (relative to in-person interviews) than those who consented to data linkage (40.1% versus 33.8%), consistent with a rapport hypothesis (i.e., the difficulty of achieving and maintaining rapport over the phone reduces consent propensity relative to in-person interviews).

Evidence for the role of burden in table 1 is mixed. We examined the metric most commonly used to measure burden in the literature (interview duration), as well as several other variables that typically increase the length and/or difficulty of the CEQ interview (household size, total expenditures, family income, and home ownership). Contrary to expectations, non-consenters and consenters did not differ significantly in household size (2.42 versus 2.42), interview duration (63.3 minutes versus 65.3 minutes), or total household spending ($11,218 versus $11,511), though the direction of the latter two findings is consistent with a burden hypothesis (consent would be highest for those most burdened). There were significant differences between consenters and non-consenters in family income and home ownership status, but the effects were in opposite directions. Non-consenters were more highly concentrated in the lowest income group (30.8% of non-consenters had family income under $8,180 versus 17.4% of consenters), but also had a higher proportion of home ownership than consenters (71.3% versus 64.0%).

Table 2 presents the weighted percentages and standard errors for the basic respondent demographics and area characteristics. Overall, there were few differences in sample composition between consenters and non-consenters. There were no significant consent rate differences by respondent gender, race, education, language of the interview, or urban status.
Non-consenters did skew significantly older than those who provided consent (e.g., 24.6% of non-consenters were 65 or older versus 19.9% of consenters) and were more likely than consenters to live in the Northeast (22.2% versus 17.6%) and in the West (27.1% versus 22.0%).

Table 2. Characteristics of Sample Respondents

| Characteristic | All respondents (n = 4,893) | Consenters (n = 3,951) | Non-consenters (n = 942) |
| --- | --- | --- | --- |
| Age group*** | | | |
| - 18–32 | 20.0 (0.9) | 21.5 (1.1) | 13.8 (1.0) |
| - 32–65 | 59.2 (0.9) | 58.6 (1.2) | 61.6 (1.5) |
| - 65+ | 20.8 (0.7) | 19.9 (0.8) | 24.6 (1.5) |
| Gender | | | |
| - Male | 46.3 (0.7) | 46.8 (0.8) | 44.3 (1.7) |
| - Female | 53.7 (0.7) | 53.2 (0.8) | 55.7 (1.7) |
| Race | | | |
| - White | 83.1 (0.9) | 83.0 (1.0) | 83.1 (1.1) |
| - Non-white | 16.9 (0.9) | 17.0 (1.0) | 16.9 (1.1) |
| Education group | | | |
| - Less than HS | 13.5 (0.6) | 13.4 (0.7) | 14.0 (1.6) |
| - HS graduate | 24.7 (0.8) | 24.5 (0.9) | 25.3 (1.8) |
| - Some college | 21.4 (0.7) | 21.2 (0.9) | 22.2 (1.9) |
| - Associate’s degree | 10.2 (0.4) | 10.0 (0.5) | 10.9 (1.1) |
| - Bachelor’s degree | 19.1 (0.5) | 19.4 (0.6) | 17.9 (1.4) |
| - Advanced degree | 11.1 (0.5) | 11.5 (0.6) | 9.7 (0.9) |
| Spanish language interview | | | |
| - Yes | 4.1 (0.5) | 3.7 (0.5) | 5.4 (1.5) |
| - No | 95.9 (0.5) | 96.3 (0.5) | 94.6 (1.5) |
| Region** | | | |
| - Northeast | 18.5 (0.6) | 17.6 (0.7) | 22.2 (1.7) |
| - Midwest | 23.5 (1.3) | 24.5 (1.4) | 19.2 (2.3) |
| - South | 35.0 (1.3) | 35.9 (1.5) | 31.5 (2.7) |
| - West | 23.0 (1.5) | 22.0 (1.7) | 27.1 (1.8) |
| Urbanicity | | | |
| - Urban | 80.5 (0.9) | 80.5 (1.2) | 80.7 (2.0) |
| - Rural | 19.5 (0.9) | 19.5 (1.2) | 19.3 (2.0) |

Note. Standard errors shown in parentheses. Asterisks denote statistically significant differences (**p < 0.01; ***p < 0.0001).

Finally, table 3 presents results from two logistic regression models for the propensity to consent to linkage. Based on results from the exploratory analyses reported in the online Appendix D, model 1 includes several classes of predictors, including consumer-unit demographics, proxies for respondent attitudes, and several related two-factor interactions. Model 2 is identical to model 1, except for exclusion of the low-income and imputed-income indicator variables.
Consequently, model 2 should be considered for sensitivity analyses of consent-propensity-adjusted income estimators in section 4.2 below. Note especially that, relative to the corresponding standard errors, the estimated coefficients for models 1 and 2 are very similar.

Table 3. Multivariate Logistic Models Predicting Consent-to-Link (Weighted)

| Variable (reference category) | Model 1 estimate | Model 1 SE | Model 2 estimate | Model 2 SE |
| --- | --- | --- | --- | --- |
| Demographic characteristics | | | | |
| Age group (32–65) | | | | |
| - 18–32 | 0.3435*** | 0.0633 | 0.3389*** | 0.0634 |
| - 65+ | −0.2692*** | 0.0645 | −0.2532*** | 0.0656 |
| Gender (Male) | | | | |
| - Female | −0.0161 | 0.0426 | −0.0229 | 0.0421 |
| Race (White) | | | | |
| - Non-white | 0.0949 | 0.1264 | 0.0612 | 0.1260 |
| Spanish interview (No) | | | | |
| - Yes | −0.4234 | 0.2923 | −0.4452 | 0.2790 |
| Education group (HS grad) | | | | |
| - Less than HS | 0.3097 | 0.1879 | 0.2926 | 0.1835 |
| - Some college | 0.4016** | 0.1423 | 0.3739* | 0.1389 |
| - Associate’s degree | −0.3321 | 0.2041 | −0.3150 | 0.1993 |
| - Bachelor’s degree | −0.0654 | 0.2112 | −0.0663 | 0.2082 |
| - Advanced degree | −0.1747 | 0.2762 | −0.1512 | 0.2773 |
| Home owner (Renter) | | | | |
| - Owner | −0.2767† | 0.1453 | −0.3233* | 0.1436 |
| Total expenditures (Log) | −0.0605 | 0.0799 | −0.0083 | 0.0800 |
| Income group | | | | |
| - Less than $8,181 | −0.2155*** | 0.0604 | | |
| Income imputed (No) | | | | |
| - Yes | −0.1487** | 0.0513 | | |
| Race × Gender | −0.1874† | 0.0956 | −0.1953* | 0.0923 |
| Owner × Education | | | | |
| - Less than HS | −0.4931* | 0.2066 | −0.4632* | 0.1996 |
| - Some college | −0.6575*** | 0.1567 | −0.6370*** | 0.1613 |
| - Associate’s degree | 0.3407† | 0.2013 | 0.3402† | 0.1986 |
| - Bachelor’s degree | 0.1385 | 0.2402 | 0.1398 | 0.2368 |
| - Advanced degree | 0.4713 | 0.2847 | 0.4226 | 0.2870 |
| Environmental features | | | | |
| Region (Northeast) | | | | |
| - Midwest | 0.2097† | 0.1234 | 0.2111† | 0.1163 |
| - South | 0.2451* | 0.1190 | 0.2408* | 0.1187 |
| - West | −0.2670* | 0.1055 | −0.2557* | 0.1066 |
| Urbanicity (Rural) | | | | |
| - Urban | −0.0268 | 0.0829 | −0.0234 | 0.0824 |
| R attitude proxies | | | | |
| Converted refusal (No) | | | | |
| - Yes | −0.0721 | 0.0756 | −0.0838 | 0.0763 |
| Effort (Moderate) | | | | |
| - A lot of effort | 0.5454* | 0.2081 | 0.6142** | 0.2065 |
| - Bare minimum effort | −0.4879*** | 0.1365 | −0.5721*** | 0.1371 |
| Doorstep concerns (None) | | | | |
| - Too busy | −0.2207 | 0.1485 | −0.2137 | 0.1466 |
| - Privacy/gov’t concerns | −1.1895*** | 0.1667 | −1.2241*** | 0.1622 |
| - Other | 1.0547** | 0.3261 | 1.0255** | 0.3255 |
| Doorstep concerns × effort | | | | |
| - Privacy × a lot of effort | −0.5884* | 0.2789 | −0.5312† | 0.2728 |
| - Privacy × minimum effort | 0.3183 | 0.1918 | 0.2601 | 0.1924 |
| - Busy × a lot of effort | −0.0888 | 0.2736 | −0.0890 | 0.2749 |
| - Busy × minimum effort | 0.2238 | 0.2096 | 0.2277 | 0.2095 |
| - Other × a lot of effort | 1.0794† | 0.6224 | 1.0477 | 0.6182 |
| - Other × minimum effort | −0.9002* | 0.3610 | −0.8777* | 0.3593 |

Note. †p < 0.10, *p < 0.05, **p < 0.01, ***p < 0.001.

4.2 Analysis of Economic Variables Subject to “Informed Consent” Access

We examine consent bias for six CEQ variables for which information is potentially available in administrative data sources that could be used to derive publishable estimates and obviate the need to ask respondents burdensome survey questions. For each of these variables, we computed five prospective estimators of the population mean $\bar{Y}$:

FS (Full Sample): The full-sample mean (i.e., data from consenters and non-consenters), with weights equal to the standard complex design weight $w_i$ used for analyses of CE variables (a joint product of sample selection weight, last visit non-response weight, subsampling weight, and non-response weight).

CUNA (Consenting Units, No Adjustment): The weighted sample mean based only on the Y variables observed for the consenting units, using the same weights $w_i$ as employed for the FS estimator.

CUPA (Consenting Units, Propensity-Adjusted; see footnote 2): A weighted sample mean based only on the Y variables observed for the consenting units, using weights equal to $w_{pi} = w_i / \hat{p}_i$.

CUPDA (Consenting Units, Propensity-Decile Adjusted): A weighted sample mean based only on the Y variables observed for the consenting units, using weights equal to $w_{p,\mathrm{decile}} = w_i / \hat{p}_{\mathrm{decile}}$, where $\hat{p}_{\mathrm{decile}}$ is the unweighted propensity of “consent” units in the specified decile group.

CUPQA (Consenting Units, Propensity-Quintile Adjusted): A weighted sample mean based only on the Y variables observed for the consenting units, using weights equal to $w_{p,\mathrm{quintile}} = w_i / \hat{p}_{\mathrm{quintile}}$, where $\hat{p}_{\mathrm{quintile}}$ is the unweighted propensity of “consent” units in the specified quintile group.

For some general background on propensity-score subclassification methods developed for survey nonresponse analyses, see Little (1986); Yang, Lesser, Gitelman, and Birkes (2010, 2012); and references cited therein.

Table 4 presents mean estimates for the six CEQ outcome variables under these five approaches (columns 2–6). The seventh column of the table examines the difference between mean estimates based on the full sample (FS) and the unadjusted consenter group (CUNA). These results show fairly substantial differences between the two approaches for the mean estimates of the income, property-tax, and rental-value variables. This suggests it would be important to carry out adjustments to account for the approximately 20% of respondents who objected to linkage and were omitted from these consenter-based estimates.

Table 4.
Comparison of Five Estimation Methods

Cells give the point estimate with its standard error in parentheses. Columns 2–6 are means; columns 7–10 are point-estimate differences from the full-sample mean, with the standard error of the difference (SEdiff) in parentheses.

| Variable (column 1) | FS: full sample (2) | CUNA: consenting units, no adjustment (3) | CUPA: consenting units, propensity-adjusted (4) | CUPDA: consenting units, propensity-decile adjusted (5) | CUPQA: consenting units, propensity-quintile adjusted (6) | CUNA − FS (7) | CUPA − FS (8) | CUPDA − FS (9) | CUPQA − FS (10) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Family income before taxes | $50,939.00 ($1,227.51) | $52,869.00 ($1,535.04) | $52,117.00 ($1,523.85) | $52,269.00 ($1,520.74) | $52,405.00 ($1,514.30) | $1,930.00** ($580.98) | $1,178.00† ($619.09) | $1,330.00* ($602.64) | $1,466.00* ($606.76) |
| Family income before taxes, with imputed values | $61,207.00 ($1,193.34) | $61,405.00 ($1,510.55) | $61,228.00 ($1,454.39) | $61,337.00 ($1,463.34) | $61,382.00 ($1,466.25) | $198.00 ($573.46) | $21.00 ($563.54) | $130.00 ($557.10) | $175.00 ($552.20) |
| Vehicle purchase cost | $599.59 ($33.22) | $619.14 ($37.05) | $607.80 ($36.63) | $607.63 ($36.67) | $611.75 ($37.61) | $19.55† ($10.83) | $8.21 ($15.34) | $8.04 ($15.23) | $12.17 ($16.13) |
| Property taxes | $454.15 ($10.41) | $429.12 ($10.76) | $434.74 ($11.39) | $434.88 ($11.37) | $436.75 ($11.30) | −$25.02*** ($6.08) | −$19.41** ($6.48) | −$19.27** ($6.05) | −$17.40** ($6.23) |
| Property value | $232,226.00 ($5,055.68) | $227,734.00 ($5,730.38) | $227,938.00 ($5,592.27) | $227,935.00 ($5,532.64) | $228,640.00 ($5,519.90) | −$4,492.00* ($2,118.39) | −$4,288.00† ($2,165.93) | −$4,291.00* ($2,132.40) | −$3,586.00* ($2,063.66) |
| Rental value | $1,296.52 ($15.58) | $1,269.65 ($17.05) | $1,272.40 ($17.81) | $1,272.91 ($17.86) | $1,274.67 ($17.68) | −$26.87** ($7.43) | −$24.12** ($8.18) | −$23.61** ($8.01) | −$21.85** ($8.02) |

Note. †p < 0.10, *p < 0.05, **p < 0.01, ***p < 0.001.

The eighth column of table 4 examines the impact of consent propensity weight adjustments on mean estimates, comparing the full sample to the propensity-adjusted consenter group. In general, the propensity adjustments improve the estimates (i.e., move them closer to the full-sample means).
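The five estimators compared in table 4 can be sketched in a few lines. The example below is a minimal illustration on synthetic data, not CEQ data, and rests on several stated assumptions: the true consent propensity stands in for the model-based estimate; the class-level propensity for CUPDA and CUPQA is read as the unweighted consent rate within each decile or quintile group; and all names (`weighted_mean`, `class_rates`, `consent_estimators`) are hypothetical.

```python
import random

def weighted_mean(y, w):
    """Ratio-type weighted mean: sum(w * y) / sum(w)."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

def class_rates(p_hat, consent, n_classes):
    """Replace each unit's propensity by the unweighted consent rate of its
    propensity class (n_classes=10 for deciles, 5 for quintiles)."""
    order = sorted(range(len(p_hat)), key=lambda i: p_hat[i])
    rates = [0.0] * len(p_hat)
    for k in range(n_classes):
        start = k * len(order) // n_classes
        stop = (k + 1) * len(order) // n_classes
        members = order[start:stop]
        rate = sum(consent[i] for i in members) / len(members)
        for i in members:
            rates[i] = rate
    return rates

def consent_estimators(y, w, p_hat, consent):
    """Return the FS, CUNA, CUPA, CUPDA and CUPQA means as a dict."""
    idx = [i for i, c in enumerate(consent) if c]  # consenting units only
    yc = [y[i] for i in idx]
    out = {
        "FS": weighted_mean(y, w),
        "CUNA": weighted_mean(yc, [w[i] for i in idx]),
        "CUPA": weighted_mean(yc, [w[i] / p_hat[i] for i in idx]),
    }
    for name, k in (("CUPDA", 10), ("CUPQA", 5)):
        r = class_rates(p_hat, consent, k)
        out[name] = weighted_mean(yc, [w[i] / r[i] for i in idx])
    return out

# Synthetic illustration: consent and the outcome both increase with x,
# so the unadjusted consenter mean (CUNA) overstates the full-sample mean.
random.seed(0)
n = 5000
x = [random.random() for _ in range(n)]
w = [random.uniform(1.0, 3.0) for _ in range(n)]   # stand-in design weights
p = [0.5 + 0.45 * xi for xi in x]                  # true consent propensity
y = [100 * xi + random.gauss(0, 5) for xi in x]
consent = [random.random() < pi for pi in p]

est = consent_estimators(y, w, p, consent)
for name, value in est.items():
    print(f"{name}: {value:.2f}")
```

In this constructed setting the propensity is known and fully explains consent, so all three adjusted means should land much closer to FS than CUNA does. The residual differences for property taxes, property value, and rental value in table 4 indicate that the fitted models do not capture the consent mechanism that completely.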
For mean family income, for example, the significant difference between the full sample and the unadjusted consenters (column 7) is reduced after adjustment and is only marginally significant (column 8). Nonetheless, it is evident that even after adjustment for the propensity to consent to link, there still are strong indications of bias for estimation of the property-tax, property-value, and rental-value variables. Consequently, we would need to study alternative approaches before considering this type of linkage in practice.

Finally, the ninth and tenth columns of table 4 report related results for the differences CUPDA − FS and CUPQA − FS, respectively. Relative to the corresponding standard errors, the differences in these columns are qualitatively similar to those reported for CUPA − FS. Consequently, we do not see strong indications of sensitivity of results to the specific methods of propensity-based weighting adjustment employed.

To complement the results reported in tables 3 and 4, it is useful to explore two related issues centered on broader distributional patterns. First, the logistic regression results in table 3 describe the complex pattern of main effects and interactions estimated for the model for the propensity of a consumer unit to consent to linkage. To provide a summary comparison of these results, figure 1 gives a quantile-quantile plot of the estimated consent-to-linkage propensity for, respectively, the “object” subpopulation (horizontal axis) and the “consent” subpopulation (vertical axis). The 99 plotted points are for the 0.01 to 0.99 quantiles (with 0.01 increment). The diagonal reference line has a slope equal to 1 and an intercept equal to 0. If the plotted points all fell on that line, then the “object” and “consent” subpopulations would have essentially the same distributions of propensity-to-consent scores, and one would conclude that the propensity scores estimated from the specified model have provided essentially no practical discriminating power.
Conversely, if all of the plotted points had a horizontal axis value close to 0 and a vertical axis value close to 1, then one would conclude that the propensity scores were providing very strong discriminating power. For the propensity scores produced from model 2, note especially that for quantiles below the median, the values for the “object” and “consent” subpopulations are relatively close, indicating that the lower values of the propensity scores provide relatively little discriminating power. Larger propensity score values (e.g., above the 0.70 quantile) show somewhat stronger discrimination. Table 5 elaborates on this result by presenting the estimated seventieth, eightieth, ninetieth, and ninety-ninth percentiles of the propensity-score values from the “consent” and “object” subpopulations.

Table 5. Selected Percentiles of the Estimated Propensity to Consent for the “Consent” and “Object” Subpopulations

| Percentile (%) | Estimated propensity of consent: consent subpopulation | Estimated propensity of consent: object subpopulation |
| --- | --- | --- |
| 70 | 0.84 | 0.88 |
| 80 | 0.86 | 0.89 |
| 90 | 0.88 | 0.92 |
| 99 | 0.93 | 0.96 |

Figure 1. Plot of the Estimated Quantiles of P(Consent) for the “Consent” Subpopulation (Vertical Axis) and “Object” Subpopulation (Horizontal Axis). Reported estimates for the 0.01 to 0.99 quantiles (with 0.01 increment). Diagonal reference line has slope = 1 and intercept = 0.

Second, the numerical results in table 4 identified substantial differences in the means of the consenting units (after propensity adjustment) and the full sample for the variables defined by unadjusted income, property tax, property value, and rental value. Consequently, it is of interest to explore the extent to which the reported mean differences are associated primarily with strong differences between distributional tails or with broader-based differences between the respective distributions. To explore this, figures 2–4 present plots of specified functions of the estimated distributions of the underlying populations of the unadjusted income variable (FINCBTAX).

Figure 2. Quantile Estimates and Associated Pointwise 95% Confidence Bounds for Before-Tax Family Income Based on the Sample from the Full Population.

Figure 3. Boxplots of Values of Before-Tax Family Income, Separately for the Ten Groups Defined by Deciles of P(Consent) as Estimated by Model 2 (Table 3). Group labels on the horizontal axis equal the midpoints of the respective decile groups.

Figure 4. Differences Between Estimated Quantiles for Before-Tax Family Income Based on the “Consent” Subpopulation with Propensity Score Adjustment and the Full Population, Respectively. Reported estimates for the 0.01 to 0.99 quantiles (with 0.01 increment), and associated pointwise 95% confidence bounds.

Figure 2 presents a plot of the 0.01 to 0.99 quantiles (with 0.01 increment) of income, accompanied by pointwise 95% confidence intervals for the full sample, based on a standard survey-weighted estimator of the corresponding distribution function. The curvature pattern is consistent with a heavily right-skewed distribution, as one generally would expect for an income variable.

Figure 3 presents side-by-side boxplots of the values of the unadjusted income variable (FINCBTAX), separately for each of the ten subpopulations defined by the deciles of the P(Consent) propensity values. With the exception of one extreme positive outlier in the seventh decile group, the ten boxplots are similar and, thus, do not provide strong evidence of differences in income distributions across the ten propensity groups.

To explore this further, for each value q = 0.01 to 0.99 (with 0.01 increment in quantiles), figure 4 displays the difference between the q-th quantile of estimated income for the “consent” subpopulation (after propensity score adjustment) and the corresponding full-sample-based estimate. Accompanying each difference is a pointwise 95% confidence interval. Here again, the plot does not display strong trends across quantile groups as the value of q increases.
Consequently, the mean differences noted for income in table 4 cannot be attributed to differences in one tail of the income distribution. Also, note that the widths of the pointwise confidence intervals for the quantile differences increase substantially as q increases. This phenomenon arises commonly in quantile analyses for right-skewed distributions and results from the declining density of observations in the right tail of the distribution. In quantile plots not included here, qualitatively similar patterns of centrality and dispersion were observed for the property tax, property value, and rental value variables.

5. DISCUSSION

5.1 Summary of Results

As noted in the introduction, legal and social environments often require sampled survey units to provide informed consent before a survey organization may link their responses with administrative or commercial records. For these cases, survey organizations must assess a complex range of factors, including (a) the general willingness of a respondent to consent to linkage; (b) the probability of successful linkage with a given record source, conditional on consent in (a); (c) the quality of the linked source; and (d) the impact of (a)–(c) on the properties of the resulting estimators that combine survey and linked-source data. Rigorous assessment of (b), (c), and (d) in a production environment can be quite expensive and time-consuming, which in turn appears to have limited the extent and pace of exploration of record linkage to supplement standard sample survey data collection. To address this problem, the current paper has considered an approach based on inclusion of a simple “consent-to-link” question in a standard survey instrument, followed by analyses to address issue (a). The resulting models for the propensity to object to linkage identified significant factors in standard demographic characteristics, proxy variables related to respondent attitudes, and related two-factor interactions.
In addition, follow-up analyses of several economic variables (directly reported income, property tax, property value, and rental value) identified substantial differences between the full-sample means and the propensity-adjusted means of the consenting subpopulation. Further analyses of the estimated quantiles of these economic variables did not indicate that these mean differences were attributable to simple tail-quantile phenomena.

5.2 Prospective Extensions

The conceptual development and numerical results considered in this paper could be extended in several directions. Of special interest would be use of alternative propensity-based weighting methods; joint modeling of consent-to-link and item response probabilities; approximate optimization of design decisions related to potentially burdensome questions; and efficient integration of test procedures into production-oriented settings.

5.2.1 Joint modeling of consent-to-link and item-missingness propensities

It would be useful to extend the analyses in table 4 to cover a wide range of survey variables with varying rates of item missingness. This would allow exploration of issues identified in previous papers that found consent-to-linkage rates were significantly lower for respondents who had higher levels of item nonresponse on previous interview waves, particularly refusals on income or wealth questions (Jenkins et al. 2006; Olson 1999; Woolf et al. 2000; Bates 2005; Dahlhamer and Cox 2007; Pascale 2011). Also, missing survey items and refusal to consent to record linkage may both be associated with the same underlying unit attributes, e.g., lack of trust or lack of interest in the survey topic. For those cases, versions of the table 4 analyses would require in-depth analyses of the joint propensity of a sample unit to provide consent to linkage and to provide a response for a given set of items.
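As a simple illustration of why joint modeling may matter, one descriptive check compares the weighted rate at which units both consent and respond to an item with the product of the two marginal rates; a ratio well away from 1 would suggest that the two propensities are not independent. The following Python sketch is illustrative only and is not the authors' method. It assumes NumPy is available, uses simulated data driven by a hypothetical latent "trust" attribute, and the function name `joint_rates` is invented for this example.

```python
import numpy as np


def joint_rates(consent, responded, w):
    """Weighted marginal and joint rates for two 0/1 indicators.

    The 'ratio' entry is joint / (product of marginals); it equals 1
    when the two indicators are uncorrelated under the weights.
    """
    consent = np.asarray(consent, dtype=float)
    responded = np.asarray(responded, dtype=float)
    w = np.asarray(w, dtype=float)
    p_c = np.average(consent, weights=w)
    p_r = np.average(responded, weights=w)
    p_cr = np.average(consent * responded, weights=w)
    return {"consent": p_c, "response": p_r, "joint": p_cr,
            "ratio": p_cr / (p_c * p_r)}


# Simulated example (all values hypothetical, not CE survey data).
rng = np.random.default_rng(1)
n = 20000
trust = rng.normal(size=n)  # hypothetical shared latent attribute
consent = rng.random(n) < 1 / (1 + np.exp(-(1.2 + 1.5 * trust)))
responded = rng.random(n) < 1 / (1 + np.exp(-(1.0 + 1.5 * trust)))
w = rng.uniform(0.5, 2.0, size=n)  # hypothetical survey weights

rates = joint_rates(consent, responded, w)
print({k: round(v, 3) for k, v in rates.items()})
```

Because both simulated indicators depend on the same latent attribute, the joint rate exceeds the product of the marginal rates. In practice one would go beyond this descriptive check, for example by fitting a model for the joint outcome conditional on observed covariates.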
5.2.2 Design optimization: linkage and assignment of potentially burdensome questions

In addition, as noted in the introduction, statistical agencies are interested in increasing the use of record linkage in ways that may reduce data collection costs and respondent burden. To implement this idea, it would be of interest to explore trade-offs in approximate measures of cost and burden that may arise through integrated use of (a) direct collection of low-burden items from all sample units; (b) direct collection of higher-burden items from some units, with the selection of those units determined through subsampling of the “consent” and “refusal” cases at different rates, while accounting for the estimated probability that a given unit will have item nonresponse for this item; and (c) use of administrative records at the unit level for some or all of the consenting units. The resulting estimation and inference methods would integrate the abovementioned data sources through modeling of the propensity to consent; the propensity to obtain a successful link, conditional on consent; and possibly the use of aggregates from the administrative record variable for all units to provide calibration weights. Within this framework, efforts to produce approximate design optimization center on determination of subsampling probabilities conditional on observable X variables. This would require estimates of the joint propensities to provide consent-to-link and to provide survey item responses. Specifics of the optimization process would depend on the inferential goals, e.g., (a) minimizing the variance for mean estimators for specified variables like income or expenditures; and (b) minimizing selected measures of cost or burden of special importance to the statistical organization.
5.2.3 Analysis of consent decisions linked with the survey lifecycle

Finally, as noted in section 1.1, efficient integration of test procedures into production-oriented settings generally involves trade-offs among relevance, resource constraints, and potential confounding effects. To illustrate, a primary goal of the current study was to assess record-linkage options for reduction of respondent burden, and we sought to identify factors associated with linkage consent, within the production environment, with little or no disruption of current production work. This naturally led to limitations of the study. For example, the current study has consent-to-link information only for respondents who completed the fifth and final wave of CEQ. This raises possible confounding of consent effects with wave-level attrition and general cooperation effects. Thus, it would be of interest to extend the current study by measuring respondent consent-to-link decisions and modeling the resulting propensities at different stages of the survey lifecycle.

Supplementary Materials

Supplementary materials are available online at http://www.oxfordjournals.org/our_journals/jssam/.

The authors thank Steve Henderson, Brandon Kopp, and Jay Ryan for many helpful discussions of the Consumer Expenditure Survey data; and Joe Sakshaug for valuable comments on an earlier version of this paper which was presented at the 2015 Joint Statistical Meetings. Initial design and placement of the “consent-to-link” question analyzed in this paper was carried out by Davis et al. (2013). The views expressed in this paper are those of the authors and do not necessarily reflect the policies of the U.S. Bureau of Labor Statistics or the U.S. Census Bureau.

Footnotes

1 In January 2015, the CEQ dropped its initial bounding interview and currently administers only four quarterly interviews. The data used in our analyses were collected prior to this design change.
However, the fundamental issues addressed in this paper remain relevant under the new CEQ survey design.

2 Since one of the key outcome variables in our bias analyses is family income before taxes, in these analyses we remove the family income group variable and imputation indicator from model 1 to estimate the survey-weighted propensity of consent-to-link $\hat{p}_i$ for all sample units i. To be consistent, we used the same propensity model for all prospective economic variables in these analyses (model 2 in table 3).

REFERENCES

Al Baghal T., Knies G., Burton J. (2014), “Linking Administrative Records to Surveys: Differences in the Correlates to Consent Decisions.” Understanding Society Working Paper Series; Institute for Social and Economic Research: No. 2014-09.
Archer K. J., Lemeshow S. (2006), “Goodness-of-Fit Test For a Logistic Regression Model Fitted Using Survey Sample Data,” The Stata Journal, 6, 97–105.
Archer K. J., Lemeshow S., Hosmer D. W. (2007), “Goodness-of-Fit Tests For Logistic Regression Models When Data Are Collected Using a Complex Sampling Design,” Computational Statistics and Data Analysis, 51, 4450–4464.
Bates N. (2005), “Development and Testing of Informed Consent Questions to Link Survey Data with Administrative Records,” Proceedings of the AAPOR-ASA Section on Survey Methods Research, pp. 3786–3792, Washington, DC: American Statistical Association.
Bates N., Wroblewski M., Pascale J. (2012), “Public Attitudes Toward the Use of Administrative Records in the U.S. Census: Does Question Frame Matter?” Center for Survey Measurement, US Census Bureau; Washington, DC: Survey Methodology Study Series #2012-04.
Beebe T., Ziegenfuss J., St Sauver J., Jenkins S., Haas L., Davern M., Tally J. (2011), “HIPAA Authorization and Survey Nonresponse Bias,” Medical Care, 49, 365–370.
Couper M., Singer E., Conrad F., Groves R.
(2008), “Risk of Disclosure, Perceptions of Risk, and Concerns About Privacy and Confidentiality as Factors in Survey Participation,” Journal of Official Statistics, 24, 255–275.
Dahlhamer J., Cox S. C. (2007), “Respondent Consent to Link Survey Data with Administrative Records: Results from a Split-Ballot Field Test with the 2007 National Health Interview Survey.” Proceedings of the Federal Committee on Statistical Methodology Research Meeting; Washington, DC. Available at: https://s3.amazonaws.com/sitesusa/wp-content/uploads/sites/242/2014/05/2007FCSM_Dahlhamer-IV-B.pdf, last accessed December 22, 2016.
Das M., Couper M. P. (2014), “Optimizing Opt-Out Consent for Record Linkage,” Journal of Official Statistics, 30, 479–497.
Davis J., Elkin I., McBride B., To N. (2013), “2011 Research Section Analysis.” Technical Report, Division of Consumer Expenditure Surveys, U.S. Bureau of Labor Statistics.
Drolet A., Morris M. (2000), “Rapport in Conflict Resolution: Accounting For How Face-to-Face Contact Fosters Mutual Cooperation in Mixed-Motive Conflicts,” Journal of Experimental Social Psychology, 36, 26–50.
Dunn K., Jordan K., Lacey R., Shapley M., Jinks C. (2004), “Patterns of Consent in Epidemiologic Research: Evidence from over 25,000 Responders,” American Journal of Epidemiology, 159, 1087–1094.
Fulton J. (2012), “Respondent Consent to Use Administrative Data,” Ph.D. dissertation, University of Maryland, Joint Program in Survey Methodology, College Park, MD.
Goyder J. (1986), “Surveys on Surveys: Limitations and Potentialities,” Public Opinion Quarterly, 50, 27–41.
Groves R., Cialdini R., Couper M. (1992), “Understanding the Decision to Participate in a Survey,” Public Opinion Quarterly, 56, 475–495.
Groves R., Couper M.
(1998), Nonresponse in Household Interview Surveys, New York: Wiley.
Groves R., Dillman D., Eltinge J., Little R. (2002), Survey Nonresponse, New York: Wiley.
Groves R., Singer E., Corning A. (2000), “Leverage-Salience Theory of Survey Participation: Description and an Illustration,” Public Opinion Quarterly, 64, 299–308.
Herzog T., Scheuren F., Winkler W. (2007), Data Quality and Record Linkage Techniques, New York: Springer.
Holbrook A., Green M., Krosnick J. (2003), “Telephone vs. Face-to-Face Interviewing of National Probability Samples with Long Questionnaires: Comparisons of Respondent Satisficing and Social Desirability Response Bias,” Public Opinion Quarterly, 67, 79–125.
Huang N., Shih S., Chang H., Chou Y. (2007), “Record Linkage Research and Informed Consent: Who Consents?,” BMC Health Services Research, 7, 18–23.
Jenkins S., Cappellari L., Lynn P., Jackle A., Sala E. (2006), “Patterns of Consent: Evidence from a General Household Survey,” Journal of the Royal Statistical Society, Series A, 169, 701–722.
Judkins D. R. (1990), “Fay’s Method for Variance Estimation,” Journal of Official Statistics, 6, 223–239.
Kho M., Duggett M., Willison D., Cook D., Brouwers M. (2009), “Written Informed Consent and Selection Bias in Observational Studies Using Medical Records: Systematic Review,” British Medical Journal, 338: b866.
Kish L. (1965), Survey Sampling, New York: Wiley.
Knies G., Burton J., Sala E. (2012), “Consenting to Health-Record Linkage: Evidence from a Multi-Purpose Longitudinal Survey of a General Population,” BMC Health Services Research, 12, 1–6.
Korn E. L., Graubard B. I.
(1990), “Simultaneous Testing of Regression Coefficients with Complex Survey Data: Use of Bonferroni t Statistics,” The American Statistician, 44, 270–276.
Kreuter F., Sakshaug J. W., Tourangeau R. (2016), “The Framing of the Record Linkage Consent Question,” International Journal of Public Opinion Research, 28, 142–152.
Korbmacher J., Schroeder M. (2013), “Consent When Linking Survey Data with Administrative Records: The Role of the Interviewer,” Survey Research Methods, 7, 115–131.
Little R. (1986), “Survey Nonresponse Adjustments for Estimates of Means,” International Statistical Review, 54, 139–157.
Mattis J. S., Hammond W. P., Grayman N., Bonacci M., Brennan W., Cowie S-A., Ladyzhenskaya L., So S. (2009), “The Social Production of Altruism: Motivations for Caring Action in a Low-Income Urban Community,” American Journal of Community Psychology, 43, 71–84.
McNabb J., Timmons D., Song J., Puckett C. (2009), “Uses of Administrative Data at the Social Security Administration,” Social Security Bulletin, 69, 75–84.
Mostafa T. (2015), “Variation Within Households in Consent to Link Survey Data to Administrative Records: Evidence from the UK Millennium Cohort Study,” International Journal of Social Research Methodology, 19, 355–375.
Olson J. (1999), “Linkages with Data from Social Security Administrative Records in the Health and Retirement Study,” Social Security Bulletin, 62, 73–85.
O’Muircheartaigh C., Campanelli P. (1999), “A Multilevel Exploration of the Role of Interviewers in Survey Non-Response,” Journal of the Royal Statistical Society, Series A, 162, 437–446.
Pascale J.
(2011), “Requesting Consent to Link Survey Data to Administrative Records: Results from a Split-Ballot Experiment in the Survey of Health Insurance and Program Participation (SHIPP).” Center for Survey Measurement, Research and Methodology Directorate, US Census Bureau; Washington, DC: Survey Methodology Study Series #2011-03.
Putnam R. (1995), “Bowling Alone: America’s Declining Social Capital,” Journal of Democracy, 6, 65–78.
Rosenbaum P. R., Rubin D. B. (1983), “The Central Role of the Propensity Score in Observational Studies for Causal Effects,” Biometrika, 70, 41–55.
Sabelhaus J., Johnson D., Ash S., Swanson D., Garner T., Greenlees J., Henderson S. (2014), “Is the Consumer Expenditure Survey Representative by Income?” Improving the Measurement of Consumer Expenditures, National Bureau of Economic Research, Studies in Income and Wealth, Chicago: University of Chicago Press.
Sakshaug J., Couper M., Ofstedal M. (2010), “Characteristics of Physical Measurement Consent in a Population-Based Survey of Older Adults,” Medical Care, 48, 64–71.
Sakshaug J., Couper M., Ofstedal M., Weir D. (2012), “Linking Survey and Administrative Data: Mechanisms of Consent,” Sociological Methods and Research, 41, 535–569.
Sakshaug J. W., Huber M. (2016), “An Evaluation of Panel Nonresponse and Linkage Consent Bias in a Survey of Employees in Germany,” Journal of Survey Statistics and Methodology, 4, 71–93.
Sakshaug J., Kreuter F. (2012), “Assessing the Magnitude of Non-Consent Biases in Linked Survey and Administrative Data,” Survey Research Methods, 6, 113–22.
Sakshaug J., Kreuter F. (2014), “The Effect of Benefit Wording on Consent to Link Survey and Administrative Records in a Web Survey,” Public Opinion Quarterly, 78, 166–176.
Sakshaug J., Tutz V., Kreuter F. (2013), “Placement, Wording and Interviewers: Identifying Correlates of Consent to Link Survey and Administrative Data,” Survey Research Methods, 7, 33–144.
Sala E., Burton J., Knies G. (2012), “Correlates of Obtaining Informed Consent to Data Linkage: Respondent, Interview, and Interviewer Characteristics,” Sociological Methods and Research, 41, 414–439.
Sala E., Knies G., Burton J. (2014), “Propensity to Consent to Data Linkage: Experimental Evidence on the Role of Three Survey Design Features in a UK Longitudinal Panel,” International Journal of Social Research Methodology, 17, 455–473.
Singer E. (1993), “Informed Consent and Survey Response: A Summary of the Empirical Literature,” Journal of Official Statistics, 9, 361–375.
Singer E. (2003), “Exploring the Meaning of Consent: Participation in Research and Beliefs About Risks and Benefits,” Journal of Official Statistics, 19, 273–285.
Singer E., Bates N., Van Hoewyk J. (2011), “Concerns About Privacy, Trust in Government, and Willingness to Use Administrative Records to Improve the Decennial Census,” Proceedings of American Statistical Association Section of the Survey Research Methods.
Tan L. (2011), “An Introduction to the Contact History Instrument (CHI) for the Consumer Expenditure Survey,” Consumer Expenditure Survey Anthology, 2011, 8–16. http://dx.doi.org/10.1299/jsmehs.2011.48.105
U.S. Department of Labor [Bureau of Labor Statistics] (2014), “Consumer Expenditures and Income,” in Handbook of Methods, Chapter 16. Washington, DC, USA. http://www.bls.gov/opub/hom/pdf/homch16.pdf, last accessed December 19, 2016.
West B., Kreuter F., Jaenichen U. (2013), “‘Interviewer’ Effects in Face-to-Face Surveys: A Function of Sampling, Measurement Error, or Nonresponse?,” Journal of Official Statistics, 29, 277–297.
Woolf S., Rothemich S., Johnson R., Marsland D. (2000), “Selection Bias from Requiring Patients to Give Consent to Examine Data for Health Services Research,” Archives of Family Medicine, 9, 1111–1118.
Yang D., Lesser V. M., Gitelman A. I., Birkes D. S. (2010), Improving the Propensity Score Equal Frequency Adjustment Estimator Using an Alternative Weight, paper presented at the American Statistical Association (ASA) Joint Statistical Meetings (JSM), Vancouver, British Columbia, August, 2010.
Yang D., Lesser V. M., Gitelman A. I., Birkes D. S. (2012), Propensity Score Adjustments for Covariates in Observational Studies, paper presented at the American Statistical Association (ASA) Joint Statistical Meetings (JSM), San Diego, California, August, 2012.

Appendix A: Variable descriptions

A.1 Predictor Variables Examined

Respondent sociodemographics
  Age: 1 if age 18 to 32, 2 if 32 to 65 (reference group), 3 if older than 65
  Gender: 1 if male (reference group), 2 if female
  Race: 0 if white (reference group), 1 if non-white
  Education: 0 if less than high school, 1 if high school graduate (reference group), 2 if some college, 3 if Associate’s degree, 4 if Bachelor’s degree, and 5 if more than Bachelor’s
  Language: 1 if interview language was Spanish, 0 if English (reference group)

  Household size: 1 if single-person HH, 2 if 2-person HH (reference group), 3 if 3- or 4-person HH, 4 if more than 4-person HH
  Household type: 0 if renter (reference group), 1 if owner
  Family income: 1 if total family income before taxes is less than or equal to $8,180, 2 if it exceeds $8,180
  Total spending (log): Log of total expenditures reported for all major expenditure categories

Survey features
  Mode: 1 if telephone, 0 if personal visit (reference group)

Area characteristics
  Region: 1 if Northeast (reference group), 2 if Midwest, 3 if South, 4 if West
  Urban status: 1 if urban, 2 if rural (reference group)

Respondent attitude and effort
  Converted refusal status: 1 if ever a converted refusal during previous interviews, 2 if not (reference group)
  Interview duration: 1 if total interview time was less than 52 minutes, 2 if greater than 52 minutes (reference group)
  Cooperation: 1 if “very cooperative,” 2 if “somewhat cooperative,” 3 if “neither cooperative nor uncooperative,” 4 if “somewhat uncooperative,” 5 if “very uncooperative”
  Effort: 1 if “a lot of effort,” 2 if “a moderate amount” (reference group), 3 if “a bare minimum”
  Effort change: 1 if effort increased during interview, 2 if decreased, 3 if stayed the same (reference group), 4 if not sure
  Use of information booklet: 1 if respondent used booklet “almost always,” 2 if “most of the time,” 3 if “occasionally,” 4 if “never or almost never,” and 5 if did not have access to booklet (reference group)
  Doorstep concerns: 0 if no concerns (reference group), 1 if privacy/anti-government concerns, 2 if too busy or logistical concerns, 3 if “other”

A.2 Interviewer-Captured Respondent Doorstep Concerns and Their Assigned Concern Category

Assigned category: Privacy/government concerns
  Privacy concerns
  Anti-government concerns

Assigned category: Too busy/logistical concerns
  Too busy
  Interview takes too much time
  Breaks appointments (puts off FR indefinitely)
  Not interested/Does not want to be bothered
  Too many interviews
  Scheduling difficulties
  Family issues
  Last interview took too long

Assigned category: Other reluctance
  Survey is voluntary
  Does not understand/Asks questions about survey
  Survey content does not apply
  Hang-up/slams door on FR
  Hostile or threatens FR
  Other HH members tell R not to participate
  Talk only to specific household member
  Respondent requests same FR as last time
  Gave that information last time
  Asked too many personal questions last time
  Intends to quit survey
  Other - specify

Assigned category: No Concerns
  No concerns

A.3 Post-Survey Interviewer Questions

A.3.1 Respondent Cooperation

How cooperative was this respondent during this interview? (very cooperative, somewhat cooperative, neither cooperative nor uncooperative, somewhat uncooperative, or very uncooperative)

A.3.2 Respondent Effort

How much effort would you say this respondent put into answering the expenditure questions during this survey? (a lot of effort, a moderate amount of effort, or a bare minimum of effort)

Appendix B: Variability of Weights

In weighted analyses of incomplete-data patterns, it is useful to assess potential variance inflation that may arise from the heterogeneity of the weights that are used.
For the current consent-to-link case, table 6 presents summary statistics for three sets of weights: the unadjusted weights $w_i$ (standard complex design weights) for all applicable units in the full sample; the same weights for sample units with consent (i.e., the units that did not object to record linkage); and propensity-adjusted weights $w_{pi} = w_i / \hat{p}_i$ for the sample units with consent, where $\hat{p}_i$ is the estimated propensity that unit $i$ will consent to record linkage, based on model 2 in table 3.

Table 6. Variability of Weights

Columns: (1) weights from full sample; (2) unadjusted weights for sample units with consent; (3) propensity-adjusted weights for sample units with consent; (4) propensity-decile adjusted weights for sample units with consent; (5) propensity-quintile adjusted weights for sample units with consent.

Statistic      (1)      (2)      (3)      (4)      (5)
n              4,893    3,951    3,915    3,915    3,915
1 + CV_wt^2    1.13     1.13     1.16     1.16     1.15
IQR/q0.5       0.875    0.877    0.858    0.853    0.851
q0.75/q0.25    1.357    1.359    1.448    1.463    1.468
q0.90/q0.10    1.970    1.966    2.148    2.178    2.124
q0.95/q0.05    2.699    2.665    3.028    2.961    2.898

In keeping with standard analyses that link heterogeneity of weights with variance inflation (e.g., Kish 1965, Section 11.7), the second row of table 6 reports the values $1 + CV_{wt}^2$, where $CV_{wt}^2$ is the squared coefficient of variation of the weights under consideration. For all three cases, $1 + CV_{wt}^2$ is only moderately larger than one. The remaining rows of table 6 provide some related nonparametric summary indications of the heterogeneity of weights, based on, respectively, the ratio of the interquartile range of the weights divided by the median weight; the ratio of the third and first quartiles of the weights; the ratio of the 90th and 10th percentiles of the weights; and the ratio of the 95th and 5th percentiles of the weights. In each case, the ratios for the propensity-adjusted weights are only moderately larger than the corresponding ratios of the unadjusted weights.
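As a rough illustration of the summaries reported in table 6, the following Python sketch computes 1 + CV_wt^2 and the quantile-based ratios for an arbitrary list of weights. It is not the code used for the paper. The function names (`kish_factor`, `weight_ratios`, `propensity_adjust`) are hypothetical, the percentiles use the standard library's "inclusive" interpolation rule (which may differ slightly from the rule behind the published values), and the toy weights and propensities at the end are invented.

```python
from statistics import fmean, pvariance, quantiles


def kish_factor(weights):
    """Return 1 + CV^2, where CV is the coefficient of variation of the weights."""
    m = fmean(weights)
    return 1.0 + pvariance(weights, mu=m) / (m * m)


def weight_ratios(weights):
    """Quantile-based heterogeneity summaries in the style of table 6."""
    # quantiles(..., n=100) returns 99 cut points; cuts[k - 1] is the k-th percentile.
    cuts = quantiles(weights, n=100, method="inclusive")

    def pct(k):
        return cuts[k - 1]

    return {
        "IQR/q0.5": (pct(75) - pct(25)) / pct(50),
        "q0.75/q0.25": pct(75) / pct(25),
        "q0.90/q0.10": pct(90) / pct(10),
        "q0.95/q0.05": pct(95) / pct(5),
    }


def propensity_adjust(weights, propensities):
    """Divide each design weight w_i by its estimated consent propensity p_i."""
    return [w / p for w, p in zip(weights, propensities)]


# Toy illustration with invented numbers (not CEQ data).
design_weights = [100.0, 120.0, 150.0, 90.0, 200.0]
consent_propensities = [0.9, 0.8, 0.85, 0.7, 0.95]
adjusted = propensity_adjust(design_weights, consent_propensities)
print(round(kish_factor(design_weights), 3), round(kish_factor(adjusted), 3))
```

Comparing `kish_factor` and `weight_ratios` before and after adjustment gives the same kind of side-by-side reading as the columns of table 6: values that rise only slightly suggest that the adjustment adds little weight heterogeneity.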
Thus, the propensity adjustments do not lead to substantial inflation in the heterogeneity of weights for the CEQ analyses considered in this paper.

Appendix C: Variance Estimators and Goodness-of-Fit Tests

C.1 Notation and Variance Estimators

In this paper, all inferential work for means, proportions and logistic regression coefficient vectors $\beta$ is based on estimated variance-covariance matrices computed through balanced repeated replication that uses 44 replicate weights and a Fay factor $K = 0.5$. Judkins (1990) provides general background on balanced repeated replication with Fay factors. Bureau of Labor Statistics (2014) provides additional background on balanced repeated replication variance estimation for the CEQ. To illustrate, for a coefficient vector $\beta$ considered in table 3, let $\hat{\beta}$ be the survey-weighted estimator based on the full-sample weights, and let $\hat{\beta}_r$ be the corresponding estimator based on the $r$-th set of replicate weights. Then the standard errors reported in table 3 are based on the variance estimator:

$$\mathrm{var}(\hat{\beta}) = \frac{1}{R(1-K)^2} \sum_{r=1}^{R} (\hat{\beta}_r - \hat{\beta})(\hat{\beta}_r - \hat{\beta})'$$

where $R = 44$ and $K = 0.5$. Similar comments apply to the standard errors for the mean and proportion estimates reported in tables 1 and 2; for the bias analyses reported in table 4; and for the standard errors used in the construction of the pointwise confidence intervals presented in figures 3 and 4.

C.2 Goodness-of-Fit Test

In keeping with Archer and Lemeshow (2006) and Archer, Lemeshow, and Hosmer (2007), we also considered goodness-of-fit tests for the logistic regression models 1 and 2 based on the F-adjusted mean residual test statistic $Q_{F\text{-adj}}$ presented by those authors. A direct extension of the reasoning considered in Archer and Lemeshow (2006) and Archer et al. (2007) indicates that under the null hypothesis of no lack of fit, and additional conditions, $Q_{F\text{-adj}}$ is distributed approximately as a central $F(g-1,\, f-g+2)$ random variable, where $f = 43$ and $g = 10$. Based on the approach outlined above, table 7 reports the p-values associated with $Q_{F\text{-adj}}$ computed for each of models 1 and 2 through use of the svylogitgof.ado module for the Stata package (http://www.people.vcu.edu/~kjarcher/Research/documents/svylogitgof.ado). For both models, the p-values do not provide any indication of lack of fit. See also Korn and Graubard (1990) for further discussion of distributional approximations for variance-covariance matrices estimated from complex survey data, and for related quadratic-form test statistics. Additional study of the properties of these test statistics would be of interest, but is beyond the scope of the current paper.

Published by Oxford University Press on behalf of the American Association for Public Opinion Research 2017. This work is written by US Government employees and is in the public domain in the US.

Journal of Survey Statistics and Methodology, Oxford University Press
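The replicate-based variance estimator of Appendix C.1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the production code used for the CEQ: the function names `fay_brr_variance` and `fay_brr_covariance` are hypothetical, the full-sample and replicate estimates are assumed to have been computed already, and the numbers in the example are invented (the paper itself uses R = 44 replicates with K = 0.5).

```python
def fay_brr_variance(full_estimate, replicate_estimates, fay_k=0.5):
    """Scalar Fay-BRR variance: sum_r (theta_r - theta)^2 / (R * (1 - K)^2)."""
    if not 0.0 <= fay_k < 1.0:
        raise ValueError("Fay factor K must satisfy 0 <= K < 1")
    r = len(replicate_estimates)
    squared_dev = sum((t - full_estimate) ** 2 for t in replicate_estimates)
    return squared_dev / (r * (1.0 - fay_k) ** 2)


def fay_brr_covariance(full_vector, replicate_vectors, fay_k=0.5):
    """Matrix version: scaled sum of outer products of replicate deviations."""
    if not 0.0 <= fay_k < 1.0:
        raise ValueError("Fay factor K must satisfy 0 <= K < 1")
    r = len(replicate_vectors)
    p = len(full_vector)
    scale = r * (1.0 - fay_k) ** 2
    cov = [[0.0] * p for _ in range(p)]
    for rep in replicate_vectors:
        # Deviation of this replicate estimate from the full-sample estimate.
        dev = [b_r - b for b_r, b in zip(rep, full_vector)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += dev[i] * dev[j] / scale
    return cov


# Invented example: four replicate estimates around a full-sample estimate of 10.
print(fay_brr_variance(10.0, [9.0, 11.0, 9.0, 11.0]))  # prints 4.0
```

Standard errors follow as the square roots of the diagonal entries of the matrix returned by `fay_brr_covariance`. With K = 0.5 the factor 1/(1 - K)^2 equals 4, which compensates for the replicate weights being perturbed only halfway toward a standard half-sample.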

Methods for Exploratory Assessment of Consent-to-Link in a Household Survey

Publisher
Oxford University Press
Copyright
Published by Oxford University Press on behalf of the American Association for Public Opinion Research 2017. This work is written by US Government employees and is in the public domain in the US.
ISSN
2325-0984
eISSN
2325-0992
D.O.I.
10.1093/jssam/smx031

Abstract

There is increasing interest in linking survey data to administrative records to reduce respondent burden and enhance the amount and quality of information available on sample respondents. In many cases, legal constraints or societal norms require survey organizations to obtain informed consent from sample units before linking survey responses with administrative data. Guiding such efforts is a growing empirical literature examining factors that impact respondents’ consent decisions and the success of linkage attempts, as well as evaluations of potential differences between consenting and non-consenting respondents. This paper outlines a range of options that statistical organizations can consider for evaluation and testing of linked datasets. We apply methods for assessing consent propensity and consent bias to data from the U.S. Consumer Expenditure Survey, and investigate the impacts of demographic, socio-economic, and attitudinal variables on respondents’ consent-to-link propensities. We then analyze potential consent-to-link biases in mean and quantile estimates of several economic variables, by comparing different propensity-adjusted and unadjusted estimates, and by comparing estimates from consenting and non-consenting respondents. We contrast several estimation approaches, and discuss implications of our findings for consent-propensity assessments and for approaches to minimize risks of consent-to-link bias.

1. INTRODUCTION

Survey organizations face many challenges in their efforts to produce high-quality survey data. The costs of data collection and the demand for data products are greater than ever, and survey budgets often are under serious strain to meet these demands. Declining survey response rates further complicate cost and data-quality considerations.

Given these challenges, survey organizations increasingly are exploring the possibility of linking survey data to administrative records. Combining survey and administrative data on the same sample unit has the potential to reduce the cost, length, and perceived burden of a survey; enrich our understanding of the underlying substantive phenomena; and offer a mechanism for targeted assessments of survey error components.

Linking survey data to administrative data sources on the same individual or household requires matching records from one dataset to the other. The efficiency and success of this matching process depend on the variables and linkage strategy used to establish the link. Exact matching techniques are most successful when unique identifying information such as a social security number (SSN) is available, but these techniques can also be effective in the absence of unique identifiers when combinations of other personal variables are compared (e.g., last name, date of birth, street name) (see Herzog, Scheuren, and Winkler 2007 for a review of statistical linkage techniques and related data-cleaning issues). Before any linkage attempt can be made, however, most countries require that survey respondents give their informed consent to link, and consent rates can vary considerably, from as low as 19% to as high as 96% (Sakshaug, Couper, Ofstedal, and Weir 2012). Lower consent rates are potentially a major challenge to wider adoption of record-linkage in statistical agencies because they increase the risk of bias in estimates derived from combined data (to the extent that there are systematic differences in key outcome measures between those who consent to link and those who do not).

As interest in and adoption of record-linkage methods have increased, so too have investigations into factors associated with respondents’ consent decisions and their potential impact on consent bias.
In general, consent-to-link phenomena can be viewed as a type of incomplete-data problem and, thus, can make use of the broad spectrum of conceptual and methodological tools that have been developed for work with incomplete data. Examples include assessment of cognitive and social processes that lead to survey response (or consent to link); modeling and diagnostic tools for estimation and evaluation of related propensity models; and empirical assessment of biases resulting from the incomplete-data pattern.

To date, direct examinations of consent bias in estimates derived from combined data are extremely rare in the literature because they require researcher access to administrative records for both consenters and non-consenters (cf. Sakshaug and Kreuter 2012; Kreuter, Sakshaug, and Tourangeau 2016). Most studies of linkage consent therefore have used data from survey respondents (available for both consenters and non-consenters) to identify characteristics correlated with consent propensity. These studies provide an indirect means of assessing the potential risk of consent bias if the underlying consent propensities are related to differences in respondents’ administrative record profiles. Findings from this literature indicate that linkage consent is often associated with respondent demographics (e.g., age, education, income), indicators of survey reluctance (e.g., prior nonresponse in a panel survey), and features of the survey (e.g., wording, placement, and timing of consent requests), but the magnitude and direction of these effects vary across studies (Bates 2005; Dahlhamer and Cox 2007; Sala, Burton, and Knies 2012; Sakshaug, Tutz, and Kreuter 2013). Only very recently have researchers started to develop theory-based hypotheses about the mechanisms of consent decisions, and to incorporate more sophisticated analytic approaches to test these hypotheses (Sala et al. 2012; Sakshaug et al. 2012; Mostafa 2015).
Finally, empirical assessment of consent-to-link propensity patterns and related potential consent biases naturally involves a complex set of trade-offs involving (a) the degree to which a given set of test conditions is relevant to current or prospective production conditions; (b) the ability to control applicable design factors and to measure relevant covariates, within the context of those current production conditions; (c) the ability to measure and model specific portions of the complex processes that lead to respondent consent and cooperation in a given setting; and (d) constraints on resources, including both the direct costs of testing consent-to-link options and the indirect costs arising from the potential impact of testing on current survey production.

The remainder of this paper considers some aspects of issues (a)–(d), with emphasis on exploratory analyses for one case study that was embedded within a current survey production process. In the next section, we review the literature on consent decisions and consent bias. In section 3, we describe in detail a specific consent-to-link case involving the US Consumer Expenditure Survey (CEQ) and then present consent propensity models and our evaluation methodology. The results of our descriptive and multivariate examinations of consent propensity and assessments of consent bias are presented in section 4. We summarize and discuss the implications of our findings in section 5. Appendix A provides detailed descriptions of the analytic variables. Appendix B presents technical results on the variability of weights used in the CEQ dataset. Appendix C discusses some features of the goodness-of-fit tests used for the models considered in section 4. In addition, an online Appendix D (Supplementary Materials) presents some related conceptual material and numerical results for hypothesis testing that led to the modeling results summarized in section 4.

2. LITERATURE REVIEW

2.1 Factors Affecting Linkage Consent

The earliest investigations of consent-to-linkage effects were mostly conducted in epidemiology and health studies that requested patients’ consent to access their medical records (Woolf, Rothemich, Johnson, and Marsland 2000; Dunn, Jordan, Lacey, Shapley, and Jinks 2004; Kho et al. 2009), but more recent studies have assessed consent in general population surveys with linkage requests to an array of administrative data sources (Knies, Burton, and Sala 2012; Sala et al. 2012; Sakshaug et al. 2012, 2013). In this section we summarize findings from this research on the factors that affect consent to data linkage.

2.1.1 Respondent demographics

Linkage consent studies largely have focused on respondents’ sociodemographic characteristics, most often age, gender, ethnicity, education, and income (Kho et al. 2009; Fulton 2012). These variables are widely available across surveys, and although they are unlikely to have direct causal impact on most consent decisions, they provide indirect measures of psychosociological factors that may influence those choices. Demographic differences between consenters and non-consenters are common, but the patterns of findings differ across studies. For example, older individuals frequently have been found to be less likely to consent to record linkage than younger people (Dunn et al. 2004; Bates 2005; Dahlhamer and Cox 2007; Huang, Shih, Chang, and Chou 2007; Pascale 2011; Sala et al. 2012; Al Baghal, Knies, and Burton 2014). But some studies have found the opposite effect (Woolf et al. 2000; Beebe et al. 2011) or no age effect (Kho et al. 2009). Males often consent at higher rates than females (Dunn et al. 2004; Bates 2005; Woolf et al. 2000; Sala et al. 2012; Al Baghal et al. 2014), but some studies find no gender effect (Pascale 2011; Sakshaug et al. 2012).
Consent propensities for ethnic minorities and non-citizens tend to be lower than for majority groups and citizens (Woolf et al. 2000; Beebe et al. 2011; Al Baghal et al. 2014; Mostafa 2015), although not all studies show these effects (Bates 2005; Kho et al. 2009). Similar inconsistencies are evident across studies for the effects of education and income on consent (Kho et al. 2009; Fulton 2012; Sakshaug et al. 2012; Sala et al. 2012). Other respondent demographic and household characteristics have been examined less frequently (e.g., marital status, employment status, household size, and owner/renter), again with mixed findings (Olson 1999; Jenkins, Cappellari, Lynn, Jackle, and Sala 2006; Al Baghal et al. 2014; Mostafa 2015).

2.1.2 Respondent attitudes

Attitudes can have a powerful impact on thought and behavior, and there is a long history of survey researchers attempting to measure respondents’ attitudes and their impact on various survey outcomes (e.g., Goyder 1986). Particular attention has been given to respondents’ attitudes about privacy and confidentiality. Research conducted by the US Census Bureau going back to the 1990s demonstrates that concerns about personal privacy and data confidentiality have increased in the general public and that these attitudes are associated with lower participation rates in the decennial census (Singer 1993, 2003) and more negative attitudes toward the use of administrative records (Singer, Bates, and Van Hoewyk 2011).

Privacy and confidentiality concerns can influence record linkage consent, as well. Both direct measures of privacy concerns (respondent self-reports) and indirect indicators (item refusals on financial questions) have been shown to be negatively associated with consent (Sakshaug et al. 2012; Sala, Knies, and Burton 2014; Mostafa 2015). Similarly, Sakshaug et al. (2012) demonstrated that the more confidentiality-related concerns respondents expressed to interviewers in a previous survey wave, the less likely they were to subsequently consent to data linkage. And Sala et al. (2014) found that concern about data confidentiality was the most frequent reason given by respondents who declined a linkage request. There also is evidence that trust (in other people, in government) and civic engagement (volunteering, political involvement) are positively related to consent (Sala et al. 2012; Al Baghal et al. 2014).

2.1.3 Saliency

Respondents’ interest in topics related to the record request or their experiences with organizations that house those records can also affect consent decisions. For example, a number of studies have found that respondents have a higher propensity to accept medical consent requests when they are in poorer health or have symptoms germane to the survey subject (Woolf et al. 2000; Dunn et al. 2004; Dahlhamer and Cox 2007; Beebe et al. 2011). One explanation for this finding is that consent requests on topics salient to respondents enhance the perceived benefits of record linkage (e.g., more comprehensive medical evaluation or the general advancement of knowledge about a disease relevant to the respondent) or reduce the perceived risks (e.g., by inducing more extensive cognitive processing of the request) (Groves, Singer, and Corning 2000).

In addition to topic saliency, respondents’ existing relationships with government agencies also can play a role in their consent decisions. Studies by Sala et al. (2012), Sakshaug et al. (2012), and Mostafa (2015), for example, found that individuals who received government benefits (e.g., welfare, food stamps, veterans’ benefits) were more likely to consent to economic data linkage than those who did not.
These results again suggest that the salience of (and attitudes toward) service-providing government agencies may make some respondents more amenable to linkage requests involving those agencies.

2.1.4 Socio-environmental features

The respondents’ environments help to shape the context in which consent decisions are made, and a handful of studies have examined associations between area characteristics and attitudes toward use of administrative records and consent decisions. For example, Singer et al. (2011) found that individuals living in the South and Mid-Atlantic regions of the country had more favorable attitudes about administrative record use by the US Census Bureau than those living in other regions of the country. Studies of actual consent and linkage rates have demonstrated regional variations as well, with higher rates in the South and Midwest and lower rates in parts of the Northeast (e.g., Olson 1999; Dahlhamer and Cox 2007).

Consent rates also can vary by urban status. Consistent with urbanicity effects seen in the literature on survey participation and pro-social behavior (Groves and Couper 1998; Mattis, Hammond, Grayman, Bonacci, Brennan, et al. 2009), respondents living in urban areas have been found to be less likely to consent than those living in non-urban areas (cf. Jenkins et al. 2006; Dahlhamer and Cox 2007; Al Baghal et al. 2014, who show a marginally significant positive effect for urbanicity). Together such area effects may indicate the influence of underlying ecological factors within those communities (e.g., differences in population density, crime, social engagement), but may also reflect differences in survey operations (e.g., in staff, protocol, and training) clustered within those geographic areas. A recent study by Mostafa (2015) found area characteristics by themselves added little explanatory power to models of consent propensity, suggesting that respondent and interview characteristics may be more important factors.
2.1.5 Interviewer characteristics

Interviewer attributes and behaviors can have significant impact on survey participation and data quality (O’Muircheartaigh and Campanelli 1999; West, Kreuter, and Jaenichen 2013), including linkage consent decisions. Studies investigating the impact of interviewer demographics generally find that they are unrelated to the consent outcome (Sakshaug et al. 2012; Sala et al. 2012), although there is some evidence of a positive effect of interviewer age on consent (Korbmacher and Schroeder 2013; Al Baghal et al. 2014). Interviewer experience has shown mixed effects. The amount of time spent working as an interviewer overall (i.e., job tenure) is either unrelated to consent (Sakshaug et al. 2012; Sala et al. 2012) or can actually have a small negative impact (Sakshaug et al. 2013; Al Baghal et al. 2014). Interviewers’ survey-specific experience, as measured by the number of interviews already completed prior to the current consent request, shows similar effects (Sakshaug et al. 2012; Sala et al. 2012). One aspect of interviewer experience that is positively related to consent in these studies is past performance in gaining respondent consent. Sala et al. (2012) and Sakshaug et al. (2012) found that the likelihood of consent increased with the number of consents obtained earlier in the field period. These authors also attempted to identify interviewer personality traits and attitudes that could affect respondent consent decisions, but largely failed to find significant effects. The one exception was that respondent consent was positively related to interviewers’ own willingness to consent to linkage (Sakshaug et al. 2012).

2.1.6 Survey design features

The way in which the consent requests are administered can impact linkage consent. Consent rates appear to be higher in face-to-face surveys than in phone surveys, though there are relatively few mode studies that have examined this phenomenon (Fulton 2012).
Consent questions that ask respondents to provide personal identifiers (e.g., full or partial SSNs) as matching variables produce lower consent rates than those that do not (Bates 2005). This finding and advancements in statistical matching techniques prompted the Census Bureau to change its approach to gaining linkage consent in 2006, and it has since adopted a passive, opt-out consent procedure in which respondents are informed of the intent to link, and consent is assumed unless respondents explicitly object (McNabb, Timmons, Song, and Puckett 2009). These implicit consent procedures (as they are sometimes called) result in higher consent rates than opt-in approaches where respondents must affirmatively state their consent (Bates 2005; Pascale 2011). See also Das and Couper (2014). Since most surveys employ opt-in formats, researchers have focused on the potential effects of the wording or framing of these questions. Consent framing experiments vary factors mentioned in the request that are thought to be persuasive to respondents, for example, highlighting the quality benefits associated with linkage, the reduction in survey collection costs, or the time savings for respondents. Evidence of framing effects in these studies is surprisingly weak, however. Bates, Wroblewski, and Pascale (2012) found that respondents reported more positive attitudes toward record linkage under cost- and time-savings frames, but the study did not measure actual linkage consent propensities. In Sakshaug and Kreuter (2014), a time-savings frame produced higher consent rates for web survey respondents than a neutrally worded consent question, but this is the only study in the literature to find significant question-framing effects (Pascale 2011; Sakshaug et al. 2013). The timing of consent requests appears to have some influence on likelihood of consent. 
Although it is common practice to delay asking the most sensitive items like linkage-consent requests until near the end of the questionnaire, recent empirical evidence indicates that this may not be optimal. Sakshaug et al. (2013) found that respondents were more amenable to consent requests administered at the beginning of the survey than at the end and suggest that the proximity of the survey and linkage requests may reinforce respondents’ desire or inclination to be consistent (i.e., agree to both). This result could inform potentially promising adaptive design interventions (e.g., asking for consent early in the interview, then skipping subsequent burdensome questions for consenters). Sala et al. (2014) obtained higher consent rates when the request was asked immediately following a series of questions on a related topic rather than waiting until the end of the survey. The authors reasoned that contextual placement of the linkage question increases the salience of the request and induces more careful consideration by respondents. Both explanations find some support in the broader psychological literature on compliance, cognitive dissonance, and context effects, but further research is needed to evaluate these and other mechanisms (e.g., survey fatigue) underlying consent placement effects (Sala et al. 2014).

2.2 Analytic Approaches to Assessing Consent Bias

Early linkage consent studies simply looked for evidence of sample bias (i.e., differences in sample composition for consenters and non-consenters). Most recent studies use logistic regression models to identify factors that influence consent and infer potential consent bias (e.g., differential consent to medical-records linkage by respondent health status).
Several studies have employed multi-level models to assess the impact of interviewers on consent propensity (e.g., Fulton 2012; Sala et al. 2012; Mostafa 2015), and others have jointly modeled respondents’ consent propensities on multiple consent items in a given survey (e.g., Mostafa 2015). Studies that have examined direct estimates of consent bias, using administrative records available for both consenters and non-consenters, are much less common in the literature. Recent studies by Sakshaug and his colleagues are notable exceptions (Sakshaug and Kreuter 2012; Sakshaug and Huber 2016). Using administrative data linked to German panel survey data, they have compared the magnitude of consent biases to bias estimates for other error sources (nonresponse, measurement) and the longitudinal changes in these biases.

Given evidence of potential consent bias (e.g., differential consent by specific demographic groups), one promising approach that has not yet been explored in this literature is to adopt propensity weighting methods that are widely used in nonresponse adjustment. Traditionally, propensity-weighted nonresponse adjustments are accomplished by modeling response propensities using logistic regression and auxiliary data available for both respondents and nonrespondents, and then using the inverse of the modeled propensity as a weight adjustment factor (Little 1986). If the predicted propensity is unbiased, this adjustment method may reduce the potential bias due to nonresponse. Of course, bias reduction is predicated on correct model specification, and this may be particularly challenging in consent propensity applications given the absence of well-developed consent theories and inconsistency in the effects of many of its predictors.

Application of these ideas in the analysis of “consent-to-link” patterns is complicated by two issues.
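Before turning to those two issues, the basic inverse-propensity adjustment described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation: the function names `weighted_mean` and `consent_bias_check` are hypothetical, the consent propensities `p_hat` are assumed to have been estimated already (for example from a logistic regression, which is not fitted here), the outcome `y` is assumed to be a survey variable observed for consenters and non-consenters alike, and the four-unit dataset is invented.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * y) / sum(w)."""
    total_w = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_w


def consent_bias_check(y, w, consented, p_hat):
    """Compare full-sample, unadjusted-consenter, and propensity-adjusted means.

    y         survey outcome observed for every respondent
    w         design weights
    consented True for units that consented to linkage
    p_hat     estimated consent propensities (assumed given)
    """
    full = weighted_mean(y, w)
    idx = [i for i, c in enumerate(consented) if c]
    y_c = [y[i] for i in idx]
    unadjusted = weighted_mean(y_c, [w[i] for i in idx])
    # Inverse-propensity adjustment: consenters with low propensity count for more.
    adjusted = weighted_mean(y_c, [w[i] / p_hat[i] for i in idx])
    return {"full": full, "unadjusted": unadjusted, "adjusted": adjusted}


# Invented data: the low-outcome group consents half the time.
y = [10.0, 10.0, 20.0, 20.0]
w = [1.0, 1.0, 1.0, 1.0]
consented = [True, False, True, True]
p_hat = [0.5, 0.5, 1.0, 1.0]
print(consent_bias_check(y, w, consented, p_hat))
```

In this toy case the unadjusted consenter mean overstates the full-sample mean because the low-outcome group is under-represented among consenters, while the adjusted mean recovers it. That recovery relies on the propensities being correct, which is exactly the model-specification caveat raised above.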
First, the decision of a respondent to provide formal consent to link does not guarantee the successful linkage of that unit to a given external data source. For example, a nominally consenting respondent may fail to provide specific forms of information (e.g., account numbers) required to perform the linkage to the external source; the external source itself may be subject to incomplete-data problems; or there may be other problems with imperfect linkage as outlined in Herzog et al. (2007). Consequently, it can be useful to consider a decomposition:

$$p_L(x; \beta_L) = p_C(x; \beta_C)\, p_{L|C}(x; \beta_{L|C}) \qquad (2.1)$$

where $p_L(x; \beta_L)$ is the probability that a unit with predictor variable $x$ will ultimately have a successful linkage to a given data source; $p_C(x; \beta_C)$ is the probability that this unit will provide nominal informed consent to link; $p_{L|C}(x; \beta_{L|C})$ is the probability of successful linkage, conditional on the unit providing consent; and $\beta_L$, $\beta_C$ and $\beta_{L|C}$ are the parameter vectors for the three respective probability models. Note that to some degree, the first factor, $p_C(x; \beta_C)$, is analogous to the probability of unit response in a standard survey setting, and the second factor $p_{L|C}(x; \beta_{L|C})$ is analogous to the probability of section or item response. In addition, note that the conditional probabilities $p_{L|C}(x; \beta_{L|C})$ may depend on a wide range of factors, including perceived sensitivity of a given set of linkage variables that the respondent may need to provide.

2.3 Options for Exploratory Analyses of Consent-to-Link

Ideally, one would explore informed-consent issues by estimating all of the parameters of model (2.1) and by evaluating potential non-consent-based biases for estimators of a large number of population parameters of interest. However, in-depth empirical work with consent for record linkage imposes a substantial burden on field data collection.
In addition, large-scale implementation of record linkage incurs substantial costs related to production systems and “data cleaning” for the variables on which we intend to link, and also incurs a risk of disruption of the ongoing survey production process. Consequently, it is important to identify alternative approaches that allow initial exploration of some aspects of model (2.1) with considerably lower costs and risks. For example, one could consider the following sequence of exploratory options:

(i) Simple lab studies. This step has the advantages of not disrupting production and relatively low costs. However, results may not reflect “real world” production conditions (per the preceding literature results on interviewer characteristics) or full population coverage (per the comments on respondent demographics and socio-environmental features).

(ii) Addition of simple consent-to-link questions to standard production instruments. This approach may incur a relatively low risk of disruption of production and relatively low incremental costs, and has the advantage of being naturally embedded in production conditions.

(iii) Same as (ii) but with actual linkage to administrative records. This approach incurs some additional cost (to carry out record linkage and related data-management and cleaning processes). In addition, this option may incur some additional respondent burden arising from collection of information required to enhance the probability of a successful link. On the other hand, option (iii) allows assessment of additional linkage-related issues (e.g., cases in which conditional linkage probabilities $p_{L|C}(x; \beta_{L|C})$ are less than 1).

(iv) Full-scale field tests of consent-to-link. This option will incur higher costs and higher risks of disruption of production processes, but will generally be considered necessary before an organization makes a final decision to implement record linkage in production processes.
Also, in some cases, interviewer attitudes and behaviors may differ between cases (ii) and (iv).

Due to the balance of potential costs, risks and benefits arising from options (i) through (iv), it may be useful to focus initial exploratory attention on options (i) and (ii), and then consider use of options (iii) and (iv) for cases in which the initial results indicate reasonable prospects for successful implementation. The current paper presents a case study of option (ii) for the Consumer Expenditure Survey, with principal emphasis on (a) evaluation of the extent to which variability in the propensity to consent may lead to biases in unadjusted estimators of some commonly studied economic variables when restricted to consenting sample units; and (b) evaluation of the extent to which simple propensity-based adjustments may reduce those potential biases.

3. DATA AND METHODS

3.1 Possible Linkage of Government Records with Sample Units from the US Consumer Expenditure Survey

In this paper, we extend the survey-based approach to assessing consent propensity and consent bias in the context of a large household expenditure survey, the US Consumer Expenditure Quarterly Interview Survey (CEQ), sponsored by the Bureau of Labor Statistics (BLS). The CEQ is an ongoing, nationally representative panel survey that collects comprehensive information on a wide range of consumers’ expenditures and incomes, as well as the characteristics of those consumers. It is designed to collect one year’s worth of expenditure data from sample units through five interviews; the first interview is for bounding purposes only, and the remaining four interviews are conducted at three-month intervals.¹ It is a rather long and burdensome survey, and the ability to link CEQ data to relevant administrative data sources (e.g., IRS data for income/assets) could eliminate the need to ask respondents to report some of these data themselves.
In principle, linkage with administrative records also could allow one to capture information that would be difficult or impossible to collect through a survey instrument. This latter motivation is of some potential interest for CE, but at present may be somewhat secondary relative to reduction of current burden levels.

The CEQ employs a complex, stratified, clustered sample design, and each calendar quarter approximately 7,100 usable interviews are conducted. In 2011, the CEQ achieved a response rate of 71.5% (BLS 2014). The survey is administered by computer-assisted personal interviewing (CAPI), either by personal visit or by telephone. Mode selection is determined jointly by the interviewer and respondent, though personal visits are encouraged, particularly in the second and fifth interviews when more detailed financial information is collected. Telephone interviewing is conducted by the same CE interviewer assigned to the case, using the same CAPI instrument as used in the personal visit interviews.

Beginning in 2011, BLS conducted research to explore the feasibility and potential impacts of integrating administrative data with CEQ survey responses. CEQ respondents who completed their final interview were asked whether they would object to combining their survey answers with data from other government agencies (Davis, Elkin, McBride, and To 2013, Section III). Nearly 80% of respondents had no objection to linkage. Although no actual data linkage occurred, we use these 2011 data to explore the extent to which prospective replacement of survey responses with administrative data could impact the quality of production estimates. To do this, we develop and compare consent propensity models that incorporate demographic, household, environmental, and attitudinal predictors suggested by the literature on linkage consent, attitudes towards administrative record use, and survey nonresponse.
We then explore an approach for examining potential consent bias, by comparing full-sample, consent-only, and propensity-weight-adjusted expenditure estimates.

This study used CEQ data from April 2011 through March 2012. During that period, respondents who completed their fifth and final CEQ interview were administered the following data-linkage consent item: “We’d like to produce additional statistical data, without taking up your time with more questions, by combining your survey answers with data from other government agencies. Do you have any objections?” Of the 5,037 respondents who were asked this question, 78.4 percent had no objections, 18.7 percent objected, and another 2.8 percent gave a “don’t know” response or were item nonrespondents on this item. We restrict our analyses to a dichotomous outcome indicator for the 4,893 respondents who consented or explicitly objected to the linkage request.

3.2 Consent Propensity Models

We develop logistic regression models that estimate sample members’ propensity to consent to the CEQ linkage request: the dependent variable takes a value of 1 if respondents consent to linkage and a value of 0 if they do not consent. Model specifications were developed through fairly extensive exploratory analyses (e.g., examinations of descriptive statistics, theory-based bivariate logistic regressions, and stepwise logistic regressions). Some results from these exploratory analyses are provided in the online Appendix D. To account for the complex stratified sampling design of the CEQ, the analyses were conducted with the SAS surveylogistic procedure. All point estimates reported in this paper are based on standard complex design weights, and all standard errors are based on balanced repeated replication (BRR) using 44 replicate weights, with a Fay factor of K = 0.5. In addition, we used F-adjusted Wald statistics to evaluate goodness-of-fit (GOF) for our models (table 7 in Appendix C).

Table 7.
F-adjusted Mean Residual Goodness-of-Fit Test

| Model | F-adjusted goodness-of-fit test p-value |
|---|---|
| 1 | 0.292 |
| 2 | 0.810 |

Table 8. Multivariate Logistic Models Predicting Consent-to-Link (Weighted)

| Variable | Model 1 Estimate | Model 1 SE | Model 2 Estimate | Model 2 SE | Model A Estimate | Model A SE | Model B Estimate | Model B SE | Model C Estimate | Model C SE | Model D Estimate | Model D SE | Model E Estimate | Model E SE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Demographic characteristics | | | | | | | | | | | | | | |
| Age group (32–65) | | | | | | | | | | | | | | |
| 18–32 | 0.3435*** | 0.0633 | 0.3389*** | 0.0634 | 0.3044*** | 0.0717 | 0.3319*** | 0.0742 | 0.3282*** | 0.0728 | 0.3425*** | 0.0640 | 0.3368*** | 0.0639 |
| 65 + | −0.2692*** | 0.0645 | −0.2532*** | 0.0656 | −0.1971** | 0.0613 | −0.2387*** | 0.0608 | −0.2374*** | 0.0630 | −0.2694*** | 0.0644 | −0.2538*** | 0.0653 |
| Gender (Male) | | | | | | | | | | | | | | |
| Female | −0.0161 | 0.0426 | −0.0229 | 0.0421 | −0.0396 | 0.0359 | −0.0388 | 0.0363 | −0.0020 | 0.0427 | −0.0152 | 0.0425 | −0.0207 | 0.0419 |
| Race (White) | | | | | | | | | | | | | | |
| Non-white | 0.0949 | 0.1264 | 0.0612 | 0.1260 | −0.0781 | 0.1059 | −0.0192 | 0.1056 | 0.0322 | 0.1155 | 0.0938 | 0.1269 | 0.0593 | 0.1264 |
| Spanish interview (No) | | | | | | | | | | | | | | |
| Yes | −0.4234 | 0.2923 | −0.4452 | 0.2790 | −0.4044 | 0.2930 | −0.4271 | 0.3120 | −0.4801 | 0.3118 | −0.4287 | 0.2973 | −0.4576 | 0.2846 |
| Education group (HS grad) | | | | | | | | | | | | | | |
| Less than HS | 0.3097 | 0.1879 | 0.2926 | 0.1835 | 0.0142 | 0.1349 | 0.0186 | 0.1383 | 0.3317† | 0.1838 | 0.3081 | 0.1885 | 0.2894 | 0.1844 |
| Some college | 0.4016** | 0.1423 | 0.3739* | 0.1389 | −0.0986 | 0.1087 | −0.0815 | 0.1129 | 0.4066** | 0.1389 | 0.3996** | 0.1442 | 0.3695* | 0.1409 |
| Associate’s degree | −0.3321 | 0.2041 | −0.3150 | 0.1993 | −0.1109 | 0.1100 | −0.1249 | 0.1145 | −0.2858 | 0.2160 | −0.3326 | 0.2036 | −0.3160 | 0.1990 |
| Bachelor’s degree | −0.0654 | 0.2112 | −0.0663 | 0.2082 | 0.0383 | 0.0719 | 0.0425 | 0.0742 | −0.0230 | 0.2037 | −0.0618 | 0.2127 | −0.0581 | 0.2095 |
| Advance degree | −0.1747 | 0.2762 | −0.1512 | 0.2773 | 0.1899 | 0.1215 | 0.1941 | 0.1235 | −0.2738 | 0.2482 | −0.1720 | 0.2773 | −0.1452 | 0.2796 |
| Home owner (Renter) | | | | | | | | | | | | | | |
| Owner | −0.2767† | 0.1453 | −0.3233* | 0.1436 | −0.3612** | 0.1277 | −0.3589** | 0.1314 | −0.2703† | 0.1397 | −0.2754† | 0.1448 | −0.3198* | 0.1433 |
| Total expenditures (Log) | −0.0605 | 0.0799 | −0.0083 | 0.0800 | 0.1025 | 0.0698 | −0.0372 | 0.0754 | −0.0419 | 0.0746 | −0.0594 | 0.0802 | −0.0065 | 0.0801 |
| Income group | | | | | | | | | | | | | | |
| Less than $8,181 | −0.2155*** | 0.0604 | | | | | −0.4182*** | 0.0561 | −0.4247*** | 0.0577 | −0.2142** | 0.0609 | | |
| Income imputed (No) | | | | | | | | | | | | | | |
| Yes | −0.1487** | 0.0513 | | | | | | | | | −0.1481** | 0.0510 | | |
| Race × gender | −0.1874† | 0.0956 | −0.1953* | 0.0923 | | | | | −0.2268* | 0.0930 | −0.1873† | 0.0957 | −0.1946* | 0.0923 |
| Owner × education | | | | | | | | | | | | | | |
| Less than HS | −0.4931* | 0.2066 | −0.4632* | 0.1996 | | | | | −0.4750* | 0.2045 | −0.4929* | 0.2067 | −0.4637* | 0.2002 |
| Some college | −0.6575*** | 0.1567 | −0.6370*** | 0.1613 | | | | | −0.6764*** | 0.1438 | −0.6548*** | 0.1578 | −0.6309*** | 0.1633 |
| Associate’s degree | 0.3407† | 0.2013 | 0.3402† | 0.1986 | | | | | 0.2616 | 0.2090 | 0.3401† | 0.2007 | 0.3387† | 0.1978 |
| Bachelor’s degree | 0.1385 | 0.2402 | 0.1398 | 0.2368 | | | | | 0.1057 | 0.2296 | 0.1358 | 0.2409 | 0.1335 | 0.2376 |
| Advance degree | 0.4713 | 0.2847 | 0.4226 | 0.2870 | | | | | 0.5867* | 0.2583 | 0.4691 | 0.2854 | 0.4183 | 0.2890 |
| Environmental features | | | | | | | | | | | | | | |
| Region (Northeast) | | | | | | | | | | | | | | |
| Midwest | 0.2097† | 0.1234 | 0.2111† | 0.1163 | 0.2602* | 0.1100 | 0.2457* | 0.1196 | 0.2352† | 0.1208 | 0.2098† | 0.1232 | 0.2111† | 0.1159 |
| South | 0.2451* | 0.1190 | 0.2408* | 0.1187 | 0.1841 | 0.1141 | 0.2017† | 0.1149 | 0.2129† | 0.1152 | 0.2446* | 0.1189 | 0.2399* | 0.1187 |
| West | −0.2670* | 0.1055 | −0.2557* | 0.1066 | −0.2248* | 0.0943 | −0.2402** | 0.1001 | −0.2441* | 0.1019 | −0.2678* | 0.1061 | −0.2574* | 0.1071 |
| Urbanicity (Rural) | | | | | | | | | | | | | | |
| Urban | −0.0268 | 0.0829 | −0.0234 | 0.0824 | −0.0253 | 0.0818 | −0.0226 | 0.0833 | −0.0249 | 0.0821 | −0.0268 | 0.0829 | −0.0233 | 0.0823 |
| R attitude proxies | | | | | | | | | | | | | | |
| Converted refusal (No) | | | | | | | | | | | | | | |
| Yes | −0.0721 | 0.0756 | −0.0838 | 0.0763 | | | | | | | −0.0714 | 0.0753 | −0.0818 | 0.0760 |
| Effort (Moderate) | | | | | | | | | | | | | | |
| A lot of effort | 0.5454* | 0.2081 | 0.6142** | 0.2065 | | | | | | | 0.5418* | 0.2101 | 0.6051** | 0.2082 |
| Bare minimum effort | −0.4879*** | 0.1365 | −0.5721*** | 0.1371 | | | | | | | −0.4842** | 0.1387 | −0.5626*** | 0.1389 |
| Doorstep concerns (None) | | | | | | | | | | | | | | |
| Too busy | −0.2207 | 0.1485 | −0.2137 | 0.1466 | | | | | | | −0.2192 | 0.1491 | −0.2106 | 0.1477 |
| Privacy/gov’t concerns | −1.1895*** | 0.1667 | −1.2241*** | 0.1622 | | | | | | | −1.1891*** | 0.1666 | −1.2231*** | 0.1621 |
| Other | 1.0547** | 0.3261 | 1.0255** | 0.3255 | | | | | | | 1.0549** | 0.3259 | 1.0267** | 0.3252 |
| Doorstep concerns × effort | | | | | | | | | | | | | | |
| Privacy × a lot of effort | −0.5884* | 0.2789 | −0.5312† | 0.2728 | | | | | | | −0.5885* | 0.2789 | −0.5319† | 0.2725 |
| Privacy × minimum effort | 0.3183 | 0.1918 | 0.2601 | 0.1924 | | | | | | | 0.3172 | 0.1915 | 0.2581 | 0.1925 |
| Busy × a lot of effort | −0.0888 | 0.2736 | −0.0890 | 0.2749 | | | | | | | −0.0841 | 0.2758 | −0.0785 | 0.2765 |
| Busy × minimum effort | 0.2238 | 0.2096 | 0.2277 | 0.2095 | | | | | | | 0.2219 | 0.2114 | 0.2234 | 0.2112 |
| Other × a lot of effort | 1.0794† | 0.6224 | 1.0477 | 0.6182 | | | | | | | 1.0746† | 0.6244 | 1.0370 | 0.6194 |
| Other × minimum effort | −0.9002* | 0.3610 | −0.8777* | 0.3593 | | | | | | | −0.8964* | 0.3617 | −0.8693* | 0.3597 |
| Rapport (Phone) | | | | | | | | | | | | | | |
| Personal Visit | | | | | | | | | | | 0.0170 | 0.0428 | 0.0395 | 0.0412 |

Note: †p < 0.10, *p < 0.05, **p < 0.01, ***p < 0.001.
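Two small computations may help in reading the Estimate and SE columns of Table 8. The sketch below is illustrative only and rests on stated assumptions: `fay_brr_variance` implements the standard Fay-adjusted BRR variance formula, the four replicate estimates are invented (the paper uses 44 replicate weights), and `odds_ratio` simply exponentiates a logit coefficient, which corresponds to a contrast with the reference category only if the models use reference-cell coding, something Table 8 does not state. Both function names are hypothetical and are not part of any survey package.

```python
import math

def fay_brr_variance(full_estimate, replicate_estimates, fay_k=0.5):
    """Fay-adjusted BRR variance: sum of squared deviations of the replicate
    estimates from the full-sample estimate, divided by R * (1 - K)^2."""
    r = len(replicate_estimates)
    ss = sum((rep - full_estimate) ** 2 for rep in replicate_estimates)
    return ss / (r * (1.0 - fay_k) ** 2)

def odds_ratio(logit_coef):
    """Convert a logistic-regression coefficient to an odds ratio."""
    return math.exp(logit_coef)

# Invented example: a full-sample estimate and four replicate estimates.
theta_full = 10.0
theta_reps = [9.0, 11.0, 10.0, 10.0]
variance = fay_brr_variance(theta_full, theta_reps, fay_k=0.5)
standard_error = math.sqrt(variance)

# Model 1 coefficient for privacy/government doorstep concerns (Table 8).
# Because Model 1 also includes doorstep-by-effort interactions, this value
# applies at the reference effort level under the coding assumption above.
privacy_or = odds_ratio(-1.1895)

print(variance, standard_error, privacy_or)
```

With K = 0.5 the divisor becomes R/4, so each squared deviation is scaled up by a factor of four relative to unadjusted BRR. Under the coding assumption above, the privacy-concern coefficient corresponds to an odds ratio of roughly 0.30, that is, odds of consent about 70% lower than for respondents who voiced no doorstep concerns.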