Special Issue Paper
Received 23 June 2016 Published online 7 December 2016 in Wiley Online Library
(wileyonlinelibrary.com) DOI: 10.1002/mma.4256
MOS subject classification: 62D05
An improved class of estimators in RR surveys
Maria Del Mar Rueda, Beatriz Cobo and Antonio Arcos
Communicated by J. Vigo-Aguiar
This work proposes a general class of estimators for the population total of a sensitive variable using auxiliary information.
Under a general randomized response model, the optimal estimator in this class is derived. Design-based properties of
the proposed estimators are obtained. A simulation study reflects the potential gains from the use of the proposed estimators
instead of the customary estimators. Copyright © 2016 John Wiley & Sons, Ltd.
Keywords: auxiliary information; randomized response technique; Horvitz–Thompson estimator
Estimation of linear parameters in a population is performed through surveys. An example is the number of voters for a particular party
in an election poll.
In many surveys, it becomes necessary to probe into areas considered sensitive and potentially embarrassing. The validity of
self-reports of sensitive attitudes and behaviors suffers from the tendency of individuals to distort their responses towards their perception
of what is socially acceptable. As a consequence, studies based on self-report measures consistently underestimate the prevalence of undesirable
attitudes or behaviors and overestimate the prevalence of desirable attitudes or behaviors. In an attempt to reduce this bias, Warner
developed the randomized response technique (RRT). His idea spawned a vast volume of literature (e.g., [2–6]).
Subsequent work has extended Warner’s model to the case where the responses to the sensitive question are quantitative
rather than a simple yes or no. The respondent selects, by means of a randomization device, one of two questions: one being
the sensitive question, the other being unrelated. Several difficulties arise when using this unrelated question method.
These difficulties are no longer present in the scrambled randomized response method introduced by Eichhorn and Hayre.
In the Eichhorn and Hayre model, each respondent scrambles their response y by multiplying it by a random variable S and then reveals
only the scrambled result z = yS to the interviewer; thus, the scrambled randomized response model maintains the privacy of the
respondents. Saha discussed the use of scrambled responses based on a combined multiplicative and additive model, in which the
respondent perturbs the answer to the sensitive question by adding one random number and multiplying by another. Bar-Lev, Bobovitch, and Boukai
proposed a method that generalizes the Eichhorn and Hayre model by introducing a design parameter that is controlled by the researcher
and used for randomizing the responses. Other important RR models are proposed in [4, 13, 14].
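The multiplicative scrambling z = yS described above can be illustrated with a short simulation. This is only a minimal sketch, not the estimator proposed in this paper: it assumes simple random sampling with equal weights, a scrambling variable S that is independent of y, and a mean E[S] known to the researcher. The uniform distribution for S, the simulated sensitive values, and the function names `scramble` and `estimate_mean` are hypothetical choices made for illustration.

```python
import random
import statistics


def scramble(y, draw_s):
    """Respondent-side step: report z = y * S, where S is drawn privately."""
    return y * draw_s()


def estimate_mean(z_values, mean_s):
    """Estimate the mean of y from scrambled reports as mean(z) / E[S].

    Assumes S is independent of y and E[S] is known (illustrative setup).
    """
    return statistics.fmean(z_values) / mean_s


rng = random.Random(12345)

# Hypothetical scrambling distribution: S ~ Uniform(0.5, 1.5), so E[S] = 1.0.
MEAN_S = 1.0


def draw_s():
    return rng.uniform(0.5, 1.5)


# Hypothetical sensitive values; the interviewer never observes these.
true_y = [rng.gauss(50.0, 10.0) for _ in range(20000)]

# The interviewer only receives the scrambled reports.
reported_z = [scramble(y, draw_s) for y in true_y]

true_mean = statistics.fmean(true_y)
est_mean = estimate_mean(reported_z, MEAN_S)

print(f"mean of true y (unobserved): {true_mean:.3f}")
print(f"estimate from scrambled z:   {est_mean:.3f}")
```

Because E[z] = E[y]E[S] under independence, dividing the average scrambled report by E[S] recovers the mean of y on average, while no individual value of y is disclosed. The price is extra variance contributed by S.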
Most research into RRT deals exclusively with the variable of interest and does not make explicit use of auxiliary variables in
the construction of estimators. Examples of these auxiliary variables in election polls could be sex, age, educational level, or taxes. Diana
and Perri pointed out that in sampling practice, direct techniques for collecting information about non-sensitive characteristics
make massive use of auxiliary variables to improve sampling design and to achieve higher precision in population parameter estimates.
Nevertheless, very few procedures have been suggested to improve the performance of randomization techniques using supplementary
information. Regression estimators for scrambled variables are defined in [16–19]. Tracy and Singh introduced the calibration of
scrambled responses and found the conditional bias and variance of the proposed estimator. Singh and Kim proposed an empirical
log-likelihood estimator for estimating the population mean of a sensitive variable in the presence of an auxiliary variable. Diana and
Perri discussed the use of auxiliary information to estimate the population mean of a sensitive variable when data are perturbed by
means of three scrambled response devices, namely, the additive, the multiplicative, and the mixed model. Horvitz, Shah, and Simmons
proposed exponential-type estimators using one and two auxiliary variables.
From a mathematical point of view, the problem is to find an optimal estimator within a class of estimators for the total of a sensitive
characteristic, under a general model for the scrambled response and in the presence of additional information.
Faculty of Science, University of Granada, Avd. Fuente Nueva, 18071 Granada, Spain
Correspondence to: M. Rueda, Faculty of Science, University of Granada, Avd. Fuente Nueva, 18071 Granada, Spain.
Copyright © 2016 John Wiley & Sons, Ltd. Math. Meth. Appl. Sci. 2018, 41 2307–2318