Fuzzy Inf. Eng. (2011) 4: 351-358
DOI 10.1007/s12543-011-0090-9

ORIGINAL ARTICLE

A Fuzzy Bootstrap Test for the Mean with $D_{p,q}$-distance

Bahram Sadeghpour-Gildeh · Sedigheh Rahimpour

Received: 30 January 2011 / Revised: 24 July 2011 / Accepted: 15 October 2011
© Springer-Verlag Berlin Heidelberg and Fuzzy Information and Engineering Branch of the Operations Research Society of China

Abstract In this paper, we consider the problem of testing a simple hypothesis about the mean of a fuzzy random variable. As the test statistic, we take a distance between the sample mean and the mean specified in the null hypothesis. An asymptotic test about the fuzzy mean is obtained by using a central limit theorem. The asymptotic distribution is an $\omega^2$-distribution, which is known only for special cases; we therefore consider random LR-fuzzy numbers. In the fuzzy setting, besides the existence of several versions of the central limit theorem, there is a practical disadvantage: the limit law is, in most cases, difficult to handle. The central limit theorem for fuzzy random variables is therefore not a very useful tool for making inferences on the mean of a fuzzy random variable, and we use the bootstrap technique instead. Finally, by means of a simulation study, we show that the bootstrap method is a powerful tool for statistical hypothesis testing about the mean of fuzzy random variables.

Keywords Bootstrap sampling · Fuzzy random variable · $D_{p,q}$-distance · LR-fuzzy numbers

1. Introduction

The concept of fuzzy sets was introduced by Zadeh [9] to describe non-statistical uncertainty (inexactness, vagueness). Fuzzy random variables, defined by Puri and Ralescu [7], deal with both kinds of uncertainty: randomness and vagueness. One of the primary purposes of statistical inference is to test hypotheses. Hypothesis testing with fuzzy data was considered by Tanaka et al. [8].
In particular, Korner [4], Gonzalez Rodriguez et al. [3] and Colubi [1] have considered the problem of hypothesis testing about the mean of a fuzzy random variable. By means of a central limit theorem, Korner [4] proposed an asymptotic test for the one-sample problem about the mean value of a fuzzy random variable. The results in Korner [4] are theoretically more widely applicable, but they become unfeasible in practice because they are quite complex, involve unknown population parameters, and require very large sample sizes to give suitable results. Nevertheless, in the classical setting there are other ways to consistently approximate the distribution of a statistic that can be extended to the fuzzy setting. An example is the bootstrap, introduced by Efron [2]. Gonzalez Rodriguez et al. [3] considered a bootstrap testing method for the mean value of fuzzy random variables. Section 2 contains some preliminaries. In Section 3, we consider the problem of hypothesis testing about the mean value of fuzzy random variables. In Section 4, we carry out a simulation study and report its results, and Section 5 concludes.

Bahram Sadeghpour-Gildeh · Sedigheh Rahimpour
Department of Statistics, Faculty of Mathematical Science, University of Mazandaran, 47416-95447, Babolsar, Iran
email: sadeghpour@umz.ac.ir

2. Preliminaries

Definition 1 Let $X$ be a universal set. A fuzzy set $\tilde{A}$ of $X$ is defined by its membership function $\tilde{A} : X \to [0, 1]$, where $\tilde{A}(x)$ is the membership grade of $x$ in $\tilde{A}$.

Definition 2 The support of $\tilde{A}$, denoted by $Supp(\tilde{A})$, is defined by
$$Supp(\tilde{A}) = \{x \mid x \in X,\ \tilde{A}(x) > 0\}. \tag{1}$$

Definition 3 For each $0 < \alpha < 1$, the $\alpha$-level set of $\tilde{A}$ is defined by
$$A_\alpha = \{x \in X \mid \tilde{A}(x) \ge \alpha\}. \tag{2}$$
The $\alpha$-level set is a classical (crisp) set.

Definition 4 A fuzzy number is a fuzzy set $\tilde{A}$ of $\mathbb{R}$ such that the following conditions are satisfied:
(1) $\tilde{A}$ is normal, i.e., $\{x \in \mathbb{R} \mid \tilde{A}(x) = 1\}$ is non-empty.
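(2) $\tilde{A}$ is convex, i.e., for all $x_1, x_2 \in \mathbb{R}$ and $\lambda \in [0, 1]$:
$$\tilde{A}(\lambda x_1 + (1-\lambda)x_2) \ge \min(\tilde{A}(x_1), \tilde{A}(x_2)). \tag{3}$$
(3) $\tilde{A}$ is upper semi-continuous with compact support.

According to the above definition, the $\alpha$-level set of a fuzzy number is a closed interval, denoted by $A_\alpha = [A_\alpha^-, A_\alpha^+]$, where
$$A_\alpha^- = \inf\{x \in \mathbb{R};\ \tilde{A}(x) \ge \alpha\}, \qquad A_\alpha^+ = \sup\{x \in \mathbb{R};\ \tilde{A}(x) \ge \alpha\}. \tag{4}$$
$F(\mathbb{R})$ denotes the set of all fuzzy numbers. By Zadeh's extension principle, for each $\tilde{A}, \tilde{B} \in F(\mathbb{R})$ and $\lambda \in \mathbb{R}$,
$$(\tilde{A} \oplus \tilde{B})_\alpha = A_\alpha + B_\alpha = \{a + b \mid a \in A_\alpha,\ b \in B_\alpha\}, \tag{5}$$
$$(\lambda \tilde{A})_\alpha = \lambda \cdot A_\alpha = \{\lambda a \mid a \in A_\alpha\}. \tag{6}$$

Definition 5 The $D_{p,q}$-distance between two fuzzy numbers $\tilde{A}, \tilde{B} \in F(\mathbb{R})$, indexed by parameters $1 \le p \le \infty$ and $0 \le q \le 1$, is a nonnegative function on $F(\mathbb{R}) \times F(\mathbb{R})$ given as follows:
$$D_{p,q}(\tilde{A}, \tilde{B}) = \begin{cases} \left[(1-q)\int_0^1 |A_\alpha^- - B_\alpha^-|^p \, d\alpha + q\int_0^1 |A_\alpha^+ - B_\alpha^+|^p \, d\alpha\right]^{1/p}, & p < \infty, \\[6pt] (1-q)\sup_{0<\alpha\le 1}|A_\alpha^- - B_\alpha^-| + q\inf_{0<\alpha\le 1}|A_\alpha^+ - B_\alpha^+|, & p = \infty. \end{cases} \tag{7}$$
In this paper, we suppose that $p = 2$ and $q = 1/2$.

Let $(\Omega, \mathcal{A}, P)$ be a probability space.

Definition 6 A mapping $\tilde{X} : \Omega \to F(\mathbb{R})$ is said to be a fuzzy random variable if and only if, for every $\alpha \in [0, 1]$,
$$\{(\omega, x) : x \in \tilde{X}_\alpha(\omega)\} \in \mathcal{A} \times \mathcal{B}, \tag{8}$$
where $\mathcal{B}$ denotes the $\sigma$-field of Borel sets in $\mathbb{R}$.

Definition 7 If $\tilde{X}$ is a fuzzy random variable, the expected value of $\tilde{X}$ is the unique fuzzy subset of $\mathbb{R}$ (if it exists), denoted by $E(\tilde{X})$, such that for all $\alpha \in [0, 1]$ we have
$$(E(\tilde{X}))_\alpha = E(\tilde{X}_\alpha) = [E(X_\alpha^-), E(X_\alpha^+)]. \tag{9}$$

Definition 8 If $E(\sup_{x\in\chi}|x|) < \infty$, the variance of $\tilde{X}$ on the basis of the $D_{p,q}$-distance is defined by
$$DVar(\tilde{X}) = E\left([D_{2,q}(\tilde{X}, \tilde{\mu})]^2\right), \tag{10}$$
where $\chi$ is the closure of the $\alpha$-level set.

3. A Hypothesis Test for the Mean of Fuzzy Random Variable

Let $\tilde{X}$ be a fuzzy random variable such that $E(\sup_{x\in\chi}|x|) < \infty$ and let $\tilde{X}_1, \cdots, \tilde{X}_n$ be a random sample obtained from $\tilde{X}$. The sample fuzzy mean value is given by
$$\bar{\tilde{X}}_n = \frac{\tilde{X}_1 \oplus \cdots \oplus \tilde{X}_n}{n}. \tag{11}$$

Theorem 1 The sample fuzzy mean value is an unbiased and consistent estimator of the fuzzy parameter $E(\tilde{X})$.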
[1]

Definition 9 The covariance operator $C_X$ of $\tilde{X}$ is defined by
$$f(C_X g) = E\{f(X_\alpha - E(X_\alpha))\, g(X_\alpha - E(X_\alpha))\}, \qquad f, g \in L^{*}(S^{d-1} \times (0, 1]), \tag{12}$$
where $S^{d-1} = \{u \in \mathbb{R}^d \mid \|u\| = 1\}$.

Definition 10 The distance between a fuzzy number $\tilde{A}$ and the non-fuzzy number $\{0\}$ is defined by
$$\|\tilde{A}\| := D_{2,\frac{1}{2}}(\tilde{A}, \{0\}). \tag{13}$$

Theorem 2 Let $\tilde{X}_1, \tilde{X}_2, \cdots, \tilde{X}_n$ be a fuzzy sample with $E(\|\tilde{X}_1\|^2) < \infty$. Then
$$n D^2_{2,\frac{1}{2}}(\bar{\tilde{X}}_n, E(\tilde{X}_1)) \xrightarrow{\ l\ } \sum_{k=1}^{\infty} \lambda_k \xi_k, \tag{14}$$
where the $\xi_k$ are independent chi-square variables with one degree of freedom and $\{\lambda_k\}$ is the set of eigenvalues of the covariance operator $C_X$ of $\tilde{X}_1$. [4]

The random variable $Y = \sum_{k=1}^{\infty} \lambda_k \xi_k$ is called $\omega^2$-distributed. The $\omega^2$-distribution is known only for special cases, i.e., for special sequences of eigenvalues $\lambda_1, \lambda_2, \cdots$.

Now an asymptotic test of the null hypothesis $H_0 : E(\tilde{X}_1) = \tilde{\mu}_0$ against the alternative $H_1 : E(\tilde{X}_1) \ne \tilde{\mu}_0$ can be formulated. We can express these hypotheses in terms of $D_{2,1/2}$:
$$H_0 : D_{2,\frac{1}{2}}(E(\tilde{X}_1), \tilde{\mu}_0) = 0, \tag{15}$$
$$H_1 : D_{2,\frac{1}{2}}(E(\tilde{X}_1), \tilde{\mu}_0) > 0. \tag{16}$$
Thus, the test rejects $H_0$ if
$$T_n = n D^2_{2,\frac{1}{2}}(\bar{\tilde{X}}_n, \tilde{\mu}_0) > t_{1-\alpha}, \tag{17}$$
where $t_{1-\alpha}$ is the $(1-\alpha)$-quantile of the $\omega^2$-distribution. Equivalently, the test rejects $H_0$ if
$$p_{asymp} = P(Y \ge t_{obs}) \le \alpha, \tag{18}$$
where $t_{obs}$ is the observed value of the test statistic $T_n$.

Remark 1 A problem with the above test is the computation of the critical point $t_{1-\alpha}$ or, equivalently, of the asymptotic p-value $p_{asymp}$. This is difficult for two reasons: first, obtaining the eigenvalues of the covariance operator $C_X$ is not easy; second, in most cases the operator $C_X$ is unknown, since it depends on some population parameters. To appreciate these facts, we consider the class of LR-fuzzy numbers.

Definition 11 The membership function of an LR-fuzzy number $\tilde{A} = (m, l, r)_{LR}$ is
$$\tilde{A}(x) = \begin{cases} L\left(\dfrac{m - x}{l}\right), & x < m, \\[4pt] 1, & x = m, \\[4pt] R\left(\dfrac{x - m}{r}\right), & x > m, \end{cases} \tag{19}$$
where $L, R : \mathbb{R}^{+} \to [0, 1]$ are fixed left-continuous and non-increasing functions with $L(0) = R(0) = 1$. The functions $L$ and $R$ are called the left and right shape functions, $m$ is the modal point, and $l, r \ge 0$ are the left and right spreads, respectively, of the LR-fuzzy number.

For two LR-fuzzy numbers $\tilde{A} = (m_A, l_A, r_A)_{LR}$ and $\tilde{B} = (m_B, l_B, r_B)_{LR}$, the squared $D_{2,1/2}$-distance is
$$D^2_{2,\frac{1}{2}}(\tilde{A}, \tilde{B}) = (m_A - m_B)^2 + R_2 (r_A - r_B)^2 + L_2 (l_A - l_B)^2 + 2(m_A - m_B)\bigl(R_1 (r_A - r_B) - L_1 (l_A - l_B)\bigr), \tag{20}$$
where
$$L^{(-1)}(\alpha) = \sup\{x \in \mathbb{R} \mid L(x) \ge \alpha\}, \qquad R^{(-1)}(\alpha) = \sup\{x \in \mathbb{R} \mid R(x) \ge \alpha\},$$
$$L_1 = \frac{1}{2}\int_0^1 L^{(-1)}(\alpha)\, d\alpha, \qquad L_2 = \frac{1}{2}\int_0^1 \bigl(L^{(-1)}(\alpha)\bigr)^2 d\alpha, \tag{21}$$
and $R_1$, $R_2$ are defined similarly.

Let $m$, $l$ and $r$ be three random variables with $P(l \ge 0) = P(r \ge 0) = 1$. A random LR-fuzzy number is defined by $\tilde{X} = (m, l, r)_{LR}$. To ensure that $E(\|\tilde{X}\|^2) < \infty$, the random variables $m, l, r$ must have finite second-order moments, that is, $E(m^2), E(l^2), E(r^2) < \infty$, and $L_2, R_2 < \infty$. The next theorem, due to [1], gives a way to calculate the eigenvalues of the covariance operator of a random LR-fuzzy number.

Theorem 3 Let $\tilde{X} = (m, l, r)_{LR}$ be a random LR-fuzzy number with $E(\|\tilde{X}\|^2) < \infty$. Then the eigenvalues of the covariance operator $C_X$ are equal to the eigenvalues of the matrix
$$K_X = \begin{pmatrix} C_{mm} - L_1 C_{lm} + R_1 C_{rm} & L_2 C_{lm} - L_1 C_{mm} & R_1 C_{mm} + R_2 C_{rm} \\ C_{lm} - L_1 C_{ll} + R_1 C_{rl} & L_2 C_{ll} - L_1 C_{lm} & R_1 C_{lm} + R_2 C_{rl} \\ C_{rm} - L_1 C_{rl} + R_1 C_{rr} & L_2 C_{rl} - L_1 C_{rm} & R_1 C_{rm} + R_2 C_{rr} \end{pmatrix}, \tag{22}$$
where for $z, y \in \{m, l, r\}$: $C_{zy} = E(z - Ez)(y - Ey)$.

To calculate the critical point $t_{1-\alpha}$ or the asymptotic p-value $p_{asymp}$, we may proceed as follows:
(i) Estimate $K_X$ from the data.
(ii) Approximate the distribution of $Y$ by that of $\hat{Y} = \sum_{i=1}^{3} \hat{\lambda}_i \xi_i$, where $\hat{\lambda}_1, \hat{\lambda}_2, \hat{\lambda}_3$ are the eigenvalues of the estimated matrix $\hat{K}_X$.
(iii) Approximate the distribution of $\hat{Y}$ by one of the following methods (Rao and Scott [5], Pardo and Zografos [6]):
(a) $\hat{\lambda}\chi^2_r$, where $r = \mathrm{rank}(\hat{K}_X)$ and $\hat{\lambda} = \frac{1}{r}\sum_{k=1}^{r} \hat{\lambda}_k$;
(b) $\hat{\lambda}_{(1)}\chi^2_r$, where $r$ is as before and $\hat{\lambda}_{(1)} \ge \hat{\lambda}_{(2)} \ge \hat{\lambda}_{(3)}$;
(c) $\hat{\lambda}(1 + \theta^2)\chi^2_\nu$, where $r$ and $\hat{\lambda}$ are as before, $\nu = \dfrac{r}{1 + \theta^2}$ and $\theta^2 = \dfrac{\sum_{i=1}^{r}(\hat{\lambda}_{(i)} - \hat{\lambda})^2}{r\hat{\lambda}^2}$.

3.1. A Fuzzy Bootstrap Test

To decide when to reject $H_0$, we need to know the distribution of $T_n$ under the null hypothesis (the null distribution of $T_n$). In general this is quite difficult to obtain, so in most cases one has to approximate it. One way to do this is to use its limiting null distribution, but we have seen that this is not operational. The bootstrap, however, can be employed to consistently approximate the null distribution of $T_n$.

Let $\tilde{X}_1, \tilde{X}_2, \cdots, \tilde{X}_n$ be independent and identically distributed (i.i.d.) fuzzy random variables. Given $\tilde{X}_1, \tilde{X}_2, \cdots, \tilde{X}_n$, let $\tilde{X}^* = (\tilde{X}_1^*, \tilde{X}_2^*, \cdots, \tilde{X}_n^*)$ be a bootstrap sample, that is, $\tilde{X}_1^*, \tilde{X}_2^*, \cdots, \tilde{X}_n^*$ are i.i.d. fuzzy random variables such that
$$P_*(\tilde{X}_1^* = \tilde{X}_j) = \frac{1}{n}, \qquad j = 1, 2, \cdots, n, \tag{23}$$
where $P_*$ denotes the bootstrap probability law, that is, the conditional probability given the original sample $\tilde{X}_1, \tilde{X}_2, \cdots, \tilde{X}_n$.

Let $T_n^* = n D^2_{2,\frac{1}{2}}(\bar{\tilde{X}}^{*}_n, \bar{\tilde{X}}_n)$, with $\bar{\tilde{X}}^{*}_n = \dfrac{\tilde{X}_1^* \oplus \cdots \oplus \tilde{X}_n^*}{n}$. We call the conditional distribution of $T_n^*$ given $\tilde{X}_1, \tilde{X}_2, \cdots, \tilde{X}_n$ the null bootstrap distribution of $T_n$.

For testing $H_0$ we consider the following test: reject $H_0$ if
$$T_n > t^{*}_{1-\alpha}, \tag{24}$$
where $t^{*}_{1-\alpha}$ is the $(1-\alpha)$-quantile of the null bootstrap distribution of $T_n$, or equivalently, if
$$p_{boot} = P_*(T_n^* \ge t_{obs}) \le \alpha, \tag{25}$$
where $p_{boot}$ is the bootstrap p-value.

In general, the values $t^{*}_{1-\alpha}$ and $p_{boot}$ are not known, but they can be approximated by simulation.

4. Experimental Results

To compare the finite-sample performance of these approximations, we have carried out a simulation experiment.
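As a rough illustration of how the bootstrap p-value in (25) can be approximated by simulation, the following Python sketch implements the test for random LR-fuzzy numbers. It is not the authors' code. It assumes triangular shape functions $L(x) = R(x) = \max\{0, 1 - x\}$, for which (21) gives $L_1 = R_1 = 1/4$ and $L_2 = R_2 = 1/6$, and it represents each LR-fuzzy number as a triple (m, l, r). The function names `d2`, `fuzzy_mean` and `bootstrap_test` are hypothetical.

```python
import numpy as np

# Assumed shape constants from (21) for L(x) = R(x) = max(0, 1 - x):
# L^(-1)(a) = 1 - a, so L1 = (1/2) * int(1 - a) = 1/4, L2 = (1/2) * int((1 - a)^2) = 1/6.
L1 = R1 = 0.25
L2 = R2 = 1.0 / 6.0


def d2(a, b):
    """Squared D_{2,1/2}-distance (20) between LR-fuzzy numbers a, b given as (m, l, r)."""
    dm, dl, dr = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return dm**2 + R2 * dr**2 + L2 * dl**2 + 2.0 * dm * (R1 * dr - L1 * dl)


def fuzzy_mean(sample):
    """Sample fuzzy mean (11); for LR-fuzzy numbers of one shape it is componentwise."""
    return np.asarray(sample, dtype=float).mean(axis=0)


def bootstrap_test(sample, mu0, n_boot=1000, rng=None):
    """Return (t_obs, p_boot) for H0: E(X) = mu0, following (17), (23) and (25)."""
    rng = np.random.default_rng() if rng is None else rng
    sample = np.asarray(sample, dtype=float)
    n = len(sample)
    xbar = fuzzy_mean(sample)
    t_obs = n * d2(xbar, mu0)  # observed statistic T_n
    t_star = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # draw n indices, each with probability 1/n
        # Bootstrap statistic T_n^*: distance to the sample mean, not to mu0.
        t_star[b] = n * d2(fuzzy_mean(sample[idx]), xbar)
    p_boot = float(np.mean(t_star >= t_obs))
    return t_obs, p_boot


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 30
    # Illustrative data: m ~ N(0, 1), l, r ~ U(0, 1), all independent.
    sample = np.column_stack(
        [rng.normal(0.0, 1.0, n), rng.uniform(0.0, 1.0, n), rng.uniform(0.0, 1.0, n)]
    )
    t_obs, p_boot = bootstrap_test(sample, mu0=(0.0, 0.5, 0.5), n_boot=1000, rng=rng)
    print(f"t_obs = {t_obs:.4f}, p_boot = {p_boot:.4f}")
```

With `p_boot` in hand, $H_0$ is rejected at level $\alpha$ when `p_boot` is at most $\alpha$, as in (25). Centering the bootstrap statistic at the sample mean rather than at $\tilde{\mu}_0$ is what lets the resampled values mimic the null distribution even when $H_0$ is false.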
In this experiment, we have considered random LR-fuzzy numbers since, as we saw in Section 3, in this particular case it is possible to estimate the eigenvalues of the covariance operator. We have taken $L(x) = R(x) = \max\{0, 1 - x\}$ and $m, l, r$ independent with $m \sim N(0, 1)$ and $l, r \sim U(0, 1)$. We have generated 10000 random samples of size $n$. For each sample, we have calculated the observed value of the test statistic $T_n$ for testing the null hypothesis
$$H_0 : E(\tilde{X}) = \left(0, \tfrac{1}{2}, \tfrac{1}{2}\right)_{LR}. \tag{26}$$
We generate B = 1000 bootstrap samples. To approximate the asymptotic p-value, we used the three approximations to the distribution of $\hat{Y} = \sum_{k=1}^{3} \hat{\lambda}_k \xi_k$ given at the end of Section 3. In this way we have obtained three estimates of the asymptotic p-value, referred to as asymp (a), asymp (b) and asymp (c).

To assess the quality of these approximations, we have also calculated the exact asymptotic p-value, which we refer to as asymp. We have considered three nominal sizes: 0.01, 0.05 and 0.1. For each approximation, we have calculated the proportion of p-values less than or equal to the nominal size, that is, the simulated size. We have also obtained the mean and variance of the p-values. Since $H_0$ is true, if the considered approximations were exact, the calculated p-values would be a random sample from a uniform distribution on the interval (0, 1), which has mean 1/2 and variance 1/12 ≈ 0.0833. The obtained results are displayed in Table 1.

Table 1: Simulated size, mean and variance of the p-values for testing $H_0$.
| n | method | α = 0.1 | α = 0.05 | α = 0.01 | mean | variance |
|---|---|---|---|---|---|---|
| 5 | asymp | 0.0952 | 0.0467 | 0.008 | 0.4976 | 0.08118 |
| 5 | boot | 0.0884 | 0.044 | 0.0064 | 0.5049 | 0.07244 |
| 5 | asymp a | 0.1176 | 0.0614 | 0.0119 | 0.5518 | 0.10782 |
| 5 | asymp b | 0.0057 | 0.0011 | 0 | 0.7519 | 0.05925 |
| 5 | asymp c | 0.1206 | 0.0505 | 0.004 | 0.5065 | 0.09748 |
| 10 | asymp | 0.0943 | 0.0469 | 0.010 | 0.49276 | 0.07988 |
| 10 | boot | 0.1389 | 0.0884 | 0.0353 | 0.4813 | 0.0942 |
| 10 | asymp a | 0.1249 | 0.07454 | 0.022 | 0.5819 | 0.1154 |
| 10 | asymp b | 0.01 | 0.00231 | 0 | 0.7892 | 0.05691 |
| 10 | asymp c | 0.115 | 0.0614 | 0.0122 | 0.5373 | 0.1019 |
| 20 | asymp | 0.0923 | 0.0437 | 0.0073 | 0.5038 | 0.07932 |
| 20 | boot | 0.0892 | 0.052 | 0.0132 | 0.4896 | 0.08774 |
| 20 | asymp a | 0.1344 | 0.09114 | 0.0353 | 0.5908 | 0.11982 |
| 20 | asymp b | 0.0088 | 0.00291 | 0.0003 | 0.8038 | 0.0546 |
| 20 | asymp c | 0.1163 | 0.0609 | 0.0138 | 0.5467 | 0.1008 |
| 30 | asymp | 0.1005 | 0.05137 | 0.01078 | 0.496 | 0.081078 |
| 30 | boot | 0.1177 | 0.0654 | 0.0175 | 0.4957 | 0.08875 |
| 30 | asymp a | 0.143 | 0.10174 | 0.04155 | 0.5928 | 0.1228 |
| 30 | asymp b | 0.0129 | 0.00531 | 0.0006 | 0.80599 | 0.0556 |
| 30 | asymp c | 0.1211 | 0.07435 | 0.0186 | 0.54595 | 0.1023 |
| 40 | asymp | 0.0959 | 0.0445 | 0.007 | 0.50246 | 0.0802 |
| 40 | boot | 0.11034 | 0.0582 | 0.0141 | 0.49759 | 0.08534 |
| 40 | asymp a | 0.142 | 0.09764 | 0.0451 | 0.6013 | 0.1239 |
| 40 | asymp b | 0.0106 | 0.0041 | 0.0005 | 0.8108 | 0.0562 |
| 40 | asymp c | 0.1188 | 0.071 | 0.0194 | 0.556 | 0.1021 |
| 50 | asymp | 0.1074 | 0.0573 | 0.012 | 0.489 | 0.0813 |
| 50 | boot | 0.109 | 0.0581 | 0.0157 | 0.4924 | 0.086 |
| 50 | asymp a | 0.1485 | 0.10284 | 0.0468 | 0.5973 | 0.1251 |
| 50 | asymp b | 0.0126 | 0.00581 | 0.0009 | 0.811 | 0.0562 |
| 50 | asymp c | 0.1292 | 0.072 | 0.0243 | 0.5461 | 0.1027 |

5. Conclusion

In Section 3, we have shown that the test rejects the null hypothesis for large values of the $D_{p,q}$-distance between the sample mean and the mean in the null hypothesis. To obtain the critical region of the test, we must approximate the null distribution of the test statistic. We have identified some operational problems associated with the asymptotic approximation in Theorem 2. The results of the simulation study show that the bootstrap behaves much better than the approximations asymp (a), asymp (b) and asymp (c).
The size of asymptotic approximation (a) is always much larger than the nominal level, whereas the size of asymptotic approximation (b) is always much smaller than the nominal level. Among the considered asymptotic approximations, approximation (c) is the one whose sizes, mean and variance are closest to the ideal values. Nevertheless, its behavior is poorer than that of the bootstrap. Moreover, in contrast to the asymptotic approximations, the bootstrap approximation can be easily implemented.

Acknowledgments

The authors would like to express their sincere thanks to the referees for their valuable comments and suggestions.

References

1. Colubi A (2009) Statistical inference about the means of fuzzy random variables: applications to the analysis of fuzzy and real-valued data. Fuzzy Sets and Systems 160: 344-356
2. Efron B (1979) Bootstrap methods: Another look at the jackknife. Ann. Statist. 7: 1-26
3. Gonzalez Rodriguez G, Montenegro M, Colubi A, Angeles Gil M (2004) Bootstrap techniques and fuzzy random variables: Synergy in hypothesis testing with fuzzy data. Fuzzy Sets and Systems 157: 2608-2613
4. Korner R (2000) An asymptotic α-test for the expectation of random fuzzy variables. J. Statist. Plann. Inference 83: 331-346
5. Rao J N K, Scott A J (1981) The analysis of categorical data from complex sample surveys: chi-squared tests for goodness of fit and independence in two-way tables. J. Amer. Statist. Assoc. 76: 221-230
6. Pardo L, Zografos K (2000) Goodness of fit tests with misclassified data based on φ-divergences. Biometrical Journal 42: 223-237
7. Puri M L, Ralescu D A (1986) Fuzzy random variables. J. Math. Anal. Appl. 114: 409-422
8. Tanaka H, Okuda T, Asai K (1965) Fuzzy information and decision in statistical model. Advances in Fuzzy Sets Theory and Applications, North-Holland, Amsterdam: 303-320
Journal
Fuzzy Information and Engineering
– Taylor & Francis
Published: Dec 1, 2011
Keywords: Bootstrap sampling; Fuzzy random variable; $D_{p,q}$-distance; LR-fuzzy numbers