Relationship between stochastic inequalities and some classical mathematical inequalities

School of Mathematics, Georgia Institute of Technology, Atlanta, GA 30332-0160, USA. e-mail: tong@math.gatech.edu

(Received 2 May 1996)

* Research supported in part by NSA Grant No. MDA904-94-H-2032.

The notions of association and dependence of random variables, rearrangements, and heterogeneity via majorization ordering have proven to be most useful for deriving stochastic inequalities. In this survey article we first show that these notions are closely related to three basic inequalities in classical mathematical analysis: Chebyshev's inequality, the Hardy-Littlewood-Pólya rearrangement inequality, and Schur functions. We then provide a brief review of some of the recent results in this area. An overall objective is to illustrate that classical mathematical inequalities of this type play a central role in the development of stochastic inequalities.

Keywords: stochastic inequalities; association; rearrangements; majorization and Schur functions.

1991 AMS Subject Classifications: Primary 60E15, Secondary 62H99.

1 INTRODUCTION

As noted by Pólya [21], "Inequalities play a role in most branches of mathematics and have different applications". This is certainly true in the area of probability and statistics. For example, the celebrated Chebyshev, Markov and Kolmogorov inequalities are well known and can be found in many probability books. In the theory of estimation, the Cramér-Rao inequality and its generalizations provide lower bounds on the variances of a large class of estimators. In statistical hypothesis testing, the Neyman-Pearson Lemma directly involves inequalities.

It appears that the development of many of the stochastic inequalities originated from the concepts of certain fundamental inequalities in mathematics. As a result, mathematical inequalities of this type have made a strong impact on stochastic inequalities. In this survey paper we discuss the influence of three basic inequalities in classical mathematical analysis, and show how certain stochastic inequalities are closely related to those classical results. Specifically, we explain how and why the concepts of (1) Chebyshev's (other) inequality, (2) the Hardy-Littlewood-Pólya (HLP) rearrangement inequality, and (3) Schur functions are related to, respectively, (1′) inequalities via association and dependence of random variables, (2′) stochastic inequalities via arrangement increasing functions, and (3′) stochastic inequalities via majorization. Those results are described and discussed in Sections 2, 3 and 4, respectively. For the reader's convenience, each of the three classical results is restated at the beginning of its section. Some related stochastic inequalities, obtained during the past two decades, are then stated with selected applications in Section 5.

For simplicity we assume that all functions and subsets involved are Borel-measurable and integrable. Nondecreasing (nonincreasing) functions will be called increasing (decreasing). Further, we note that the results described in this paper are for illustrative purposes only; hence they are neither complete nor exclusive. For additional results on these topics, the reader is referred to the bibliographies contained in the references listed at the end of this paper. The proofs of some of the results can also be found there.
2 CHEBYSHEV'S INEQUALITY, ASSOCIATION, AND STOCHASTIC DEPENDENCE

We first state a classical result of Chebyshev; a convenient reference for this result is Hardy et al. ([9], p. 43). For notational consistency, the statement given here is a modified version and is not in its original form.

THEOREM 2.1. Let $G_1, G_2 : [0,1] \to \mathbb{R}$ be two real-valued functions. If they are both increasing, then the inequality
$$\int_0^1 G_1(u) G_2(u)\,du \;\ge\; \left[\int_0^1 G_1(u)\,du\right]\left[\int_0^1 G_2(u)\,du\right] \qquad (2.1)$$
holds.

Note that if the word "increasing" in Theorem 2.1 is replaced by "decreasing", then, by replacing $G_j$ with $-G_j$ ($j = 1, 2$) in (2.1), the same conclusion also holds. When compared with Chebyshev's inequality that provides bounds for the tail probabilities of a distribution via the first two moments, this result seems to be less well known; thus Theorem 2.1 is sometimes referred to as "Chebyshev's other inequality" in the literature.

If we interpret the integrals as expectations of random variables, then this result yields a corresponding stochastic inequality in the following sense. Consider a random variable $X$ with distribution function $F(x)$, and let $(g_1(X), g_2(X))$ be a two-dimensional random vector such that $g_1, g_2$ are real-valued functions. By letting $u = F(x)$ and $G_j = g_j \circ F^{-1}$ ($j = 1, 2$) in (2.1), we immediately obtain

THEOREM 2.1′. If $g_1, g_2$ are both increasing or both decreasing, then
$$E[g_1(X) g_2(X)] \;\ge\; E[g_1(X)]\, E[g_2(X)]$$
holds; or equivalently, $\mathrm{Corr}(g_1(X), g_2(X)) \ge 0$, where $E(\cdot)$ denotes expectation and "Corr" stands for the correlation coefficient.

Intuitively speaking, Theorem 2.1′ states that if $g_1$ and $g_2$ are both increasing or both decreasing, then $g_1(X)$ and $g_2(X)$ tend to take larger values together and smaller values together; thus their correlation coefficient is nonnegative. The theorem therefore involves positive dependence of random variables that are monotone functions of a common random variable. A question of interest is whether it can be generalized to monotone functions of several random variables. This question can be answered by studying the following concept of association of random variables, which was first considered by Esary et al. [8]:

DEFINITION 2.2. For $n \ge 1$ the random variables $X_1, \ldots, X_n$ are said to be associated (or the set of random variables $\{X_1, \ldots, X_n\}$ is said to be associated) if, for all real-valued functions $g_1, g_2$ that are increasing in each component when the other components are held fixed, the inequality
$$E\!\left[\prod_{j=1}^{2} g_j(\mathbf{X})\right] \;\ge\; \prod_{j=1}^{2} E[g_j(\mathbf{X})] \qquad (2.2)$$
holds, or equivalently $\mathrm{Corr}(g_1(\mathbf{X}), g_2(\mathbf{X})) \ge 0$, where $\mathbf{X} = (X_1, \ldots, X_n)$.

Esary et al. [8] proved the following theorem:

THEOREM 2.3. (a) A set consisting of a single random variable is a set of associated random variables. (b) Independent random variables are associated random variables. (c) A subset of a set of associated random variables forms a set of associated random variables. (d) Increasing functions of associated random variables are associated random variables.

From a historical viewpoint, the Esary-Proschan-Walkup paper was motivated by a research problem in application and was not a direct outgrowth of the Chebyshev inequality. Nevertheless, the notion is clearly a generalization of that in Theorem 2.1; in fact, Theorem 2.3(a) is equivalent to Theorem 2.1.
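As a quick numerical illustration of Theorem 2.1′ (a minimal sketch that is not part of the original paper; the standard normal distribution and the particular monotone functions are illustrative assumptions):

```python
# Minimal Monte Carlo sketch of Theorem 2.1': for a single random variable X
# and two increasing functions g1, g2, the sample covariance of g1(X) and
# g2(X) should be nonnegative up to sampling noise. Distribution and
# functions are illustrative choices, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(200_000)

g1 = np.tanh(x)          # increasing
g2 = x ** 3              # increasing
print(np.cov(g1, g2)[0, 1])   # positive, as Theorem 2.1' predicts

g3 = -x                  # decreasing; pairing an increasing with a
print(np.cov(g1, g3)[0, 1])   # decreasing function flips the sign
```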
A repeated application of Theorem 2.3 yields the following multivariate probability inequality. Note that, in particular, the result applies to independent random variables.

FACT 2.4. For $k \ge 2$ let $g_1, \ldots, g_k : \mathbb{R}^n \to \mathbb{R}$ be increasing. If $X_1, \ldots, X_n$ are associated random variables, then
$$P\!\left[\bigcap_{j=1}^{k} \{g_j(\mathbf{X}) \ge \lambda_j\}\right] \;\ge\; P\!\left[\bigcap_{j=1}^{u} \{g_j(\mathbf{X}) \ge \lambda_j\}\right] P\!\left[\bigcap_{j=u+1}^{k} \{g_j(\mathbf{X}) \ge \lambda_j\}\right] \;\ge\; \prod_{j=1}^{k} P\left[g_j(\mathbf{X}) \ge \lambda_j\right] \qquad (2.3)$$
holds for all $u < k$ and arbitrary but fixed real numbers $\lambda_1, \ldots, \lambda_k$.

Fact 2.4 has a special application to stochastic processes with independent increments:

FACT 2.5. Let $S = \{Y_t : t \in T\}$ be a stochastic process with independent increments, where the parameter space $T$ is either discrete or continuous. For $k \ge 2$ let $g_1, \ldots, g_k : \mathbb{R}^n \to \mathbb{R}$ be increasing functions. Then for arbitrary but fixed $t_1 < \cdots < t_n$ in $T$ the set of random variables $\{g_1(Y_{t_1}, \ldots, Y_{t_n}), \ldots, g_k(Y_{t_1}, \ldots, Y_{t_n})\}$ is associated. Thus, when substituting $(Y_{t_1}, \ldots, Y_{t_n})$ for $\mathbf{X}$ in (2.3), the inequalities hold for all $u < k$ and all $\lambda_1, \ldots, \lambda_k$.

Fact 2.5 has certain applications in boundary-crossing types of problems, and it represents a generalization of some earlier results. For example, Robbins [23] previously obtained the last inequality in (2.3) for $k = n$ and $g_j(Y_{t_1}, \ldots, Y_{t_n}) = Y_{t_j}$ ($j = 1, \ldots, k$) via a direct verification. His result now follows from Fact 2.5 as a special case.

The notions of association and monotone transformations of random variables arise in many problems in probability and statistics. For an extensive application in reliability theory, especially on reliability bounds for coherent systems, see Barlow and Proschan [1, Chapter 2]. In multivariate statistical analysis, Fact 2.4 is often applied to reduce the dimensionality of the joint distribution of random variables. In this area there are other related notions for defining positive dependence of random variables. A notion that is stronger than association is the multivariate totally-positive-of-order-2 (MTP2) property of the joint probability density function (p.d.f.). For a comprehensive treatment of TP2 functions and related results, see Karlin [14]. For a description of the orderings of the notions of positive dependence, see Barlow and Proschan [1, Chapter 5] and Tong [28, Chapter 5]. It is known that for the multivariate normal distribution all of these notions are equivalent. As a result, many special results for this distribution have been obtained. For example, Pitt's [20] result states that if $\mathbf{X} = (X_1, \ldots, X_n)$ is distributed according to a multivariate normal distribution, then $X_1, \ldots, X_n$ are associated if and only if all of the simple correlation coefficients are nonnegative.

3 THE HLP REARRANGEMENT INEQUALITY AND ARRANGEMENT INCREASING (AI) FUNCTIONS

The HLP rearrangement inequality, due to Hardy et al. [9, Chapter 10], is an algebraic inequality for inner products. For $n \ge 2$ let $\mathbf{a} = (a_1, \ldots, a_n)$ and $\mathbf{b} = (b_1, \ldots, b_n)$ be two real vectors. Their inner product is defined to be $\mathbf{a}\mathbf{b}^T = \sum_{i=1}^{n} a_i b_i$. The problem of interest is to find the maximum and the minimum of this inner product over all permutations of the components of $\mathbf{a}$ and $\mathbf{b}$. The HLP rearrangement inequality states:

THEOREM 3.1. Let $(a_{[1]}, \ldots, a_{[n]})$ and $(a_{(1)}, \ldots, a_{(n)})$ be two vectors obtained by rearranging the components of $\mathbf{a}$ such that $a_{[1]} \ge \cdots \ge a_{[n]}$ and $a_{(1)} \le \cdots \le a_{(n)}$, and let $(b_{[1]}, \ldots, b_{[n]})$ and $(b_{(1)}, \ldots, b_{(n)})$ be defined similarly. Then for all $\mathbf{a}, \mathbf{b}$ the inequalities
$$\sum_{i=1}^{n} a_{(i)} b_{[i]} \;\le\; \sum_{i=1}^{n} a_i b_i \;\le\; \sum_{i=1}^{n} a_{(i)} b_{(i)}$$
hold. Thus, the inner product of $\mathbf{a}, \mathbf{b}$ is minimized (maximized) over all possible permutations when their components are completely reversely ordered (similarly ordered).
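A small Python check of Theorem 3.1 enumerates all permutations of one vector (not from the paper; the two integer vectors are arbitrary illustrative choices, kept integral so the comparison is exact):

```python
# Check of the HLP rearrangement inequality (Theorem 3.1): over all
# permutations of b, the inner product a.b is largest for similarly ordered
# components and smallest for reversely ordered components.
from itertools import permutations
import numpy as np

a = np.array([1, 4, 2, 7])
b = np.array([3, -1, 5, 2])

products = [np.dot(a, np.array(p)) for p in permutations(b)]
print(max(products) == np.dot(np.sort(a), np.sort(b)))        # True (similarly ordered)
print(min(products) == np.dot(np.sort(a), np.sort(b)[::-1]))  # True (reversely ordered)
```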
Without loss of generality, let us assume that $a_1 \le a_2 \le \cdots \le a_n$ already holds; we shall denote such a vector by the symbol $\mathbf{a}^{\uparrow}$. If the components of $\mathbf{b}$ are not in ascending order, then there exist two integers $r < s$ such that $b_r > b_s$. Let $\mathbf{b}_1$ denote the vector obtained from $\mathbf{b}$ by interchanging $b_r$ and $b_s$, and write $\mathbf{b} \prec \mathbf{b}_1$ to mean that $b_r$ and $b_s$ are now in ascending order in the new vector $\mathbf{b}_1$. If $\mathbf{b}^* = (b_1^*, \ldots, b_n^*)$ is another vector such that there exist $N$ vectors $\mathbf{b}_1, \ldots, \mathbf{b}_N$ satisfying $\mathbf{b} \prec \mathbf{b}_1 \prec \cdots \prec \mathbf{b}_N = \mathbf{b}^*$, then we can change $\mathbf{b}$ into $\mathbf{b}^*$ in a finite number of rearrangements by interchanging two components at a time in this fashion; in this case we write $\mathbf{b} \prec \mathbf{b}^*$. Adopting such an approach, Sobel [26] obtained the following generalization of Theorem 3.1:

THEOREM 3.1′. Let $\mathbf{a}$, $\mathbf{b}$ and $\mathbf{b}^*$ be $n$-dimensional real vectors. If $\mathbf{b} \prec \mathbf{b}^*$, then $(\mathbf{a}^{\uparrow})\mathbf{b}^T \le (\mathbf{a}^{\uparrow})(\mathbf{b}^*)^T$.

In a 1977 paper, Hollander, Proschan and Sethuraman studied functions that are decreasing in transposition. Subsequently, such functions are called "arrangement increasing" in the Marshall-Olkin [17, Section 6.F] book. The definition of such a function is given below:

DEFINITION 3.2. $\phi(\mathbf{a}, \mathbf{b}) : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ is said to be an arrangement increasing (AI) function of $(\mathbf{a}, \mathbf{b})$ if (i) $\phi(\mathbf{a}, \mathbf{b}) = \phi(\pi(\mathbf{a}), \pi(\mathbf{b}))$ for every vector $\pi = (\pi_1, \ldots, \pi_n)$ which is a permutation of $(1, \ldots, n)$, where $\pi(a_1, \ldots, a_n) = (a_{\pi_1}, \ldots, a_{\pi_n})$; and (ii) $\mathbf{b} \prec \mathbf{b}^*$ implies $\phi(\mathbf{a}^{\uparrow}, \mathbf{b}) \le \phi(\mathbf{a}^{\uparrow}, \mathbf{b}^*)$.

With this definition, Theorem 3.1 essentially says that $\phi(\mathbf{a}, \mathbf{b}) = \mathbf{a}\mathbf{b}^T$ is an AI function. When applied to probability and statistics, the problem of interest is to develop some basic results for showing that certain functions are AI, and thus to obtain probability inequalities via rearrangements. Hollander et al. [10] proved a fundamental preservation theorem and gave many such results. In particular, they proved a theorem that involves the ranks and the parameters of random variables from several populations.

Let $\mathcal{F} = \{f(x, \theta) : \theta \in \Omega\}$ denote a family of univariate p.d.f.'s, where $\Omega \subset \mathbb{R}$ is the parameter space. $\mathcal{F}$ is said to possess the monotone likelihood ratio (MLR) property if, for all $\theta_1 < \theta_2$ in $\Omega$, the ratio $f(x, \theta_2)/f(x, \theta_1)$ is an increasing function of $x$. This property represents a notion of stochastic ordering of random variables in the sense that $E_{\theta}\, g(X)$ is increasing in $\theta$ for all increasing functions $g$; thus a random variable $X$ associated with a larger $\theta$ is stochastically larger in this sense. Now let $X_1, \ldots, X_n$ be $n$ independent random variables with p.d.f.'s $f(x, \theta_1), \ldots, f(x, \theta_n)$, respectively. For $i = 1, \ldots, n$ let $R_i$ denote the rank of $X_i$ in the combined sample $(X_1, \ldots, X_n)$. Then, intuitively speaking, the rank of a random variable with a larger $\theta$ value tends to be higher. This is affirmed in the following theorem due to Hollander et al. [10]:

THEOREM 3.3. If $\mathcal{F}$ has the MLR property and $X_1, \ldots, X_n$ are independent, then
$$\phi(\boldsymbol{\theta}, \mathbf{r}) = P_{\boldsymbol{\theta}}\left[R_1 = r_1, \ldots, R_n = r_n\right]$$
is an AI function of $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_n)$ and $\mathbf{r} = (r_1, \ldots, r_n)$.

Theorem 3.3 has many useful applications in rank order statistics and nonparametrics; for details, see Hollander et al. [10] and the related references. Other stochastic applications of arrangement increasing functions include statistical inference problems based on order statistics. In particular, ranking and selection problems concern the selection of the populations associated with the larger parameters. By applying Theorem 3.1 it can be shown algebraically that if the family of p.d.f.'s $\mathcal{F}$ is an exponential family, then the corresponding likelihood function is an AI function of $(\boldsymbol{\theta}, \mathbf{x})$; thus it is maximized when the parameters $\theta_1, \ldots, \theta_n$ and the observations $x_1, \ldots, x_n$ are similarly ordered. This property was applied extensively by Bechhofer, Kiefer and Sobel in their 1968 monograph.
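The exponential-family remark can be checked numerically. The sketch below is not from the paper: it assumes a normal location model with unit variance and illustrative parameter and observation values, and it enumerates all pairings of observations with parameters.

```python
# For a normal location model with unit variance, the joint likelihood
# prod_i f(x_i; theta_i), viewed over all pairings of observations with
# parameters, is maximized when the theta's and the x's are similarly ordered.
from itertools import permutations
import numpy as np

theta = np.array([0.0, 1.0, 2.5])   # location parameters, already ascending
x = [2.0, -0.5, 1.0]                # observations

def likelihood(th, xs):
    xs = np.asarray(xs)
    return float(np.prod(np.exp(-0.5 * (xs - th) ** 2) / np.sqrt(2.0 * np.pi)))

values = {p: likelihood(theta, p) for p in permutations(x)}
best = max(values, key=values.get)
print(best)                       # (-0.5, 1.0, 2.0): similarly ordered with theta
print(best == tuple(sorted(x)))   # True
```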
4 MAJORIZATION, SCHUR FUNCTIONS, AND STOCHASTIC INEQUALITIES

The notion of majorization concerns the heterogeneity of the elements of a real vector. Let $\mathbf{a} = (a_1, \ldots, a_n)$ and $\mathbf{b} = (b_1, \ldots, b_n)$ be two real vectors and let the $a_{[i]}$'s and $b_{[i]}$'s be defined as in Theorem 3.1.

DEFINITION 4.1. $\mathbf{a}$ is said to majorize $\mathbf{b}$, in symbols $\mathbf{a} \succ \mathbf{b}$, if $\sum_{i=1}^{m} a_{[i]} \ge \sum_{i=1}^{m} b_{[i]}$ holds for $m = 1, \ldots, n-1$, and equality holds for $m = n$.

This definition provides a partial ordering of the "heterogeneity" of the components of $\mathbf{a}$ and $\mathbf{b}$. In particular, it is easy to see that (i) $\mathbf{a} \succ (\bar a, \ldots, \bar a)$ holds for all $\mathbf{a}$, where $\bar a$ is the arithmetic mean of the $a_i$'s, and (ii) if $a_i \ge 0$ ($i = 1, \ldots, n$), then $\left(\sum_{i=1}^{n} a_i, 0, \ldots, 0\right) \succ \mathbf{a}$. It is known that:

THEOREM 4.2. (a) $\mathbf{a} \succ \mathbf{b}$ holds iff $\sum_{i=1}^{n} h(a_i) \ge \sum_{i=1}^{n} h(b_i)$ holds for all continuous convex functions $h$. (b) $\mathbf{a} \succ \mathbf{b}$ holds iff $\mathbf{b} = \mathbf{a}P$ for some $n \times n$ doubly stochastic matrix $P$.

Definition 4.1 is closely related to the following definition of Schur-convex and Schur-concave functions:

DEFINITION 4.3. $\phi : \mathbb{R}^n \to \mathbb{R}$ is said to be a Schur-convex (Schur-concave) function if $\mathbf{a} \succ \mathbf{b}$ implies $\phi(\mathbf{a}) \ge \phi(\mathbf{b})$ ($\phi(\mathbf{a}) \le \phi(\mathbf{b})$).

The basic ideas of majorization and Schur functions have played a role in classical mathematical analysis. In their monograph, Marshall and Olkin [17] gave a comprehensive and complete treatment of this topic, including its historical developments, basic theory, related inequalities, and various applications that were known before 1979. In the rest of this section we provide a review of some existing stochastic inequalities derived via majorization. An earlier result, due to Marshall and Olkin [16], can be stated in the form of an integral inequality:

THEOREM 4.4. If $f$ and $g$ are Schur-concave functions defined on $\mathbb{R}^n$, then the function $\psi$ defined on $\mathbb{R}^n$ by
$$\psi(\boldsymbol{\theta}) = \int_{\mathbb{R}^n} g(\boldsymbol{\theta} - \mathbf{x}) f(\mathbf{x})\,d\mathbf{x}$$
is a Schur-concave function.

By using the property that $(\boldsymbol{\theta} - \mathbf{x}) \succ (\boldsymbol{\theta}^* - \mathbf{x}^*)$ iff $(\mathbf{x} - \boldsymbol{\theta}) \succ (\mathbf{x}^* - \boldsymbol{\theta}^*)$, and by letting $g$ be the indicator function of a Schur-concave subset of $\mathbb{R}^n$, Theorem 4.4 implies the following result given in Marshall and Olkin [16]:

FACT 4.5. If the p.d.f. $f(\mathbf{x})$ of an $n$-dimensional random vector $\mathbf{X} = (X_1, \ldots, X_n)$ is Schur-concave, then the distribution function $F(\mathbf{a}) = P[X_1 \le a_1, \ldots, X_n \le a_n]$ is a Schur-concave function of $\mathbf{a} = (a_1, \ldots, a_n)$.

Proschan and Sethuraman [22] applied the results in majorization theory to study stochastic majorization and provided the following definition:

DEFINITION 4.6. A random vector $\mathbf{X} = (X_1, \ldots, X_n)$ is said to stochastically majorize $\mathbf{Y} = (Y_1, \ldots, Y_n)$ if $E[\psi(\mathbf{X})] \ge (\le)\ E[\psi(\mathbf{Y})]$ holds for all Schur-convex (Schur-concave) functions $\psi$.

They then obtained several useful theorems via this stochastic majorization ordering. One such result deals with the probability contents of a Schur-concave set for nonnegative random variables with TP2 and semigroup properties; a convenient reference for that theorem and its applications is Marshall and Olkin [17, p. 101].

Hollander et al. [10] also showed that if $\phi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ is of the form $\phi(\mathbf{a}, \mathbf{b}) = g(\mathbf{a} - \mathbf{b})$ for some function $g : \mathbb{R}^n \to \mathbb{R}$, then $\phi$ is AI iff $g(\mathbf{x})$ is a Schur-concave function of $\mathbf{x}$. This result yields rearrangement inequalities via majorization for location parameter families in statistics.
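Definition 4.1 translates directly into a short computational check. The helper below is a sketch, not part of the paper; it tests majorization via partial sums of the decreasing rearrangements and confirms the two elementary examples noted after the definition, using an arbitrary illustrative test vector.

```python
# Test whether a majorizes b (Definition 4.1): compare partial sums of the
# decreasing rearrangements and require equal totals.
import numpy as np

def majorizes(a, b, tol=1e-12):
    a, b = np.sort(a)[::-1], np.sort(b)[::-1]
    if abs(a.sum() - b.sum()) > tol:
        return False
    return bool(np.all(np.cumsum(a) >= np.cumsum(b) - tol))

a = np.array([5.0, 1.0, 3.0])
mean_vec = np.full(3, a.mean())            # (a_bar, ..., a_bar)
extreme = np.array([a.sum(), 0.0, 0.0])    # (sum of a_i, 0, ..., 0)

print(majorizes(a, mean_vec))   # True: every a majorizes its mean vector
print(majorizes(extreme, a))    # True: the extreme vector majorizes a
print(majorizes(mean_vec, a))   # False
```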
5 SOME RECENT DEVELOPMENTS

There are many recent results in stochastic inequalities that have been derived through association of random variables, AI functions, and majorization theory. Due to space limitations, we describe only a few of them in this section. Interested readers may find additional results in the bibliographies contained in the references given in this paper.

5.1 Positive Association and Negative Association

Random variables that satisfy Definition 2.2 tend to take larger values together and smaller values together; thus they may be called "positively associated". Many useful inequalities derived via this notion have become available since the publication of the Esary et al. [8] paper, and some of them originated from applications. A comprehensive reference on their applications in reliability theory is the Barlow-Proschan [1] book, and more recent applications in this area can be found in the review article by Boland et al. [6]. Some of the recent results and their selected applications in biology and medicine, business and economics, operations research (including queueing theory), and statistical inference can be found in Block et al. [3], Shaked and Tong [25] and Shaked and Shanthikumar [24].

The concept of negative association was introduced by Block et al. [4], Joag-Dev and Proschan [11], and others. There have been different forms of the definition of negative association; a convenient one is the following (Joag-Dev and Proschan [11]):

DEFINITION 5.1. Random variables $X_1, \ldots, X_n$ are said to be negatively associated if, for all disjoint subsets $C_1 = \{r_1, \ldots, r_m\}$ and $C_2 = \{s_1, \ldots, s_{m'}\}$ of $\{1, \ldots, n\}$ and all increasing functions $g_1 : \mathbb{R}^m \to \mathbb{R}$ and $g_2 : \mathbb{R}^{m'} \to \mathbb{R}$, the inequality
$$E\left[g_1(X_{r_1}, \ldots, X_{r_m})\, g_2(X_{s_1}, \ldots, X_{s_{m'}})\right] \;\le\; E\left[g_1(X_{r_1}, \ldots, X_{r_m})\right] E\left[g_2(X_{s_1}, \ldots, X_{s_{m'}})\right]$$
holds.

It follows immediately from Definition 5.1 (by taking $g_1, g_2$ to be indicator functions) that if $X_1, \ldots, X_n$ are negatively associated, then
$$P\left[\bigcap_{i=1}^{m} \{X_{r_i} \ge \lambda_i\} \cap \bigcap_{i=1}^{m'} \{X_{s_i} \ge \delta_i\}\right] \;\le\; P\left[\bigcap_{i=1}^{m} \{X_{r_i} \ge \lambda_i\}\right] P\left[\bigcap_{i=1}^{m'} \{X_{s_i} \ge \delta_i\}\right]$$
holds for all $\lambda_i$ and $\delta_i$. Certain applications of negative association of random variables to specific distributions have been studied in the literature. For example, Block et al. [4] proved that if the joint distribution of $X_1, \ldots, X_n$ is multinomial, then they are negatively associated.

5.2 Stochastic Arrangement Increasing and Multivariate Arrangement Increasing Functions

Certain generalizations of the notion of arrangement increasing functions have been made recently for deriving new multivariate probability inequalities. For example, Boland et al. [6] applied the fundamental preservation theorem in Hollander et al. [10] to obtain the following result:

THEOREM 5.2. Assume that $(X_1, \ldots, X_n)$ is a random vector with a p.d.f. that is permutation invariant. Let $h_1, h_2$ be AI functions on $\mathbb{R}^n \times \mathbb{R}^n$, and let $g_1, g_2 : \mathbb{R} \to \mathbb{R}$ be increasing. Then
$$\phi(\mathbf{a}, \mathbf{b}) = E\left[g_1(h_1(\mathbf{a}, \mathbf{X}))\, g_2(h_2(\mathbf{X}, \mathbf{b}))\right]$$
is an AI function of $(\mathbf{a}, \mathbf{b}) \in \mathbb{R}^n \times \mathbb{R}^n$, where the expectation is taken over the distribution of $\mathbf{X}$.

By applying this result to permutation invariant distributions and to families of probability distributions with AI density functions, many new results were given in Boland et al. [6]. For example, Theorem 5.2 immediately implies that if $f(\mathbf{x})$ is permutation invariant, then $P[a_1 \le X_1 \le b_1, \ldots, a_n \le X_n \le b_n]$ is an AI function of $(\mathbf{a}, \mathbf{b})$. (This is an earlier unpublished result of Boland.) For a convenient reference for these results, see Chapter 16 of Pečarić et al. [19].

Another set of new results on this topic was given by Boland and Proschan [5] through a multivariate generalization of AI functions. They provided a useful definition of multivariate AI functions of the form $\phi : \mathbb{R}^n \times \mathbb{R}^n \times \cdots \times \mathbb{R}^n \to \mathbb{R}$, proved a main theorem, showed that many existing results are special cases of the main theorem, and then applied it to derive new results in multivariate probability inequalities. For details, see Boland and Proschan [5] or Chapter 17 of Pečarić et al. [19].
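The rectangle-probability consequence of Theorem 5.2 noted above can be illustrated in the simplest permutation-invariant case, independent standard normals, for which the probability factorizes and is computable exactly from the normal CDF. This sketch is not from the paper and the interval endpoints are illustrative assumptions.

```python
# For independent standard normals (a permutation-invariant density), the
# probability P[a_i <= X_i <= b_i for all i] is larger when the components
# of a and b are similarly ordered, as the AI property predicts.
from scipy.stats import norm

def rect_prob(a, b):
    p = 1.0
    for lo, hi in zip(a, b):
        p *= max(norm.cdf(hi) - norm.cdf(lo), 0.0)
    return p

a = (0.0, 1.0)
print(rect_prob(a, (1.0, 3.0)))   # similarly ordered pairing: about 0.054
print(rect_prob(a, (3.0, 1.0)))   # reversed pairing: the interval [1, 1] is empty, so 0.0
```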
5.3 Some Recent Results and Generalizations of Majorization-Related Inequalities

In the area of majorization-related stochastic inequalities, many new results have become available since the publication of the Marshall-Olkin [17] book. The following is a description of some of them.

5.3 (A) Generalized Majorization and Multivariate Majorization

There exist several generalizations of the notion of majorization, including multivariate majorization, majorization orderings for continuous functions, and other related orderings. Three types of multivariate majorization were already treated in Chapter 15 of Marshall and Olkin [17], and useful stochastic inequalities have been obtained via their applications. A generalization of the majorization ordering via integrals of continuous functions, and other multivariate majorization ideas, were treated recently by Joe [12], Joe and Verducci [13], and others. Selected applications to the comparison of stochastic heterogeneity of probability distributions were also given.

5.3 (B) Additional Majorization-Related Multivariate Probability Inequalities

Many additional multivariate inequalities have been obtained via the majorization ordering. For example, motivated by Fact 4.5, Tong [29] proved that:

FACT 5.3. If the p.d.f. $f(\mathbf{x})$ of $\mathbf{X} = (X_1, \ldots, X_n)$ is Schur-concave, then (a) the distribution function of $|\mathbf{X}| = (|X_1|, \ldots, |X_n|)$, that is $P[|X_1| \le a_1, \ldots, |X_n| \le a_n]$, is a Schur-concave function of $\mathbf{a} = (a_1, \ldots, a_n)$; and (b) the probability $P\left[\sum_{i=1}^{n} (X_i/a_i)^2 \le \lambda\right]$ is a Schur-concave function of $(a_1^2, \ldots, a_n^2)$ for every fixed $\lambda > 0$.

Tong [29] also conjectured that if the conditions on $f(\mathbf{x})$ in Fact 5.3 are satisfied, then for all positive even integers $m \ge 2$ the probability $P\left[\sum_{i=1}^{n} (X_i/a_i)^m \le \lambda\right]$ is a Schur-concave function of $\left(a_1^{m/(m-1)}, \ldots, a_n^{m/(m-1)}\right)$ for all $\lambda > 0$. In a 1983 paper Karlin and Rinott proved that this conjecture is true. Karlin and Rinott [15] and Tong [30] also independently obtained a multivariate probability inequality for $n$-dimensional rectangles via multivariate majorization. For a description of that and other related results, see the survey articles Tong [31] and Tong [33]. We note that all of these results yield integral inequalities in $\mathbb{R}^n$.

5.3 (C) A Positive Dependence Ordering via a Majorization Ordering of Dimension Vectors

In a 1977 paper, Tong applied Muirhead's inequality (see Marshall and Olkin [17, p. 87]) to prove the following result:

THEOREM 5.4. Let $X_1, \ldots, X_n$ be nonnegative random variables. If their joint p.d.f. is absolutely continuous with respect to either Lebesgue measure or the counting measure, and is permutation invariant, then $E\left[\prod_{i=1}^{n} X_i^{a_i}\right]$ is a Schur-convex function of $\mathbf{a} = (a_1, \ldots, a_n)$ (with the convention $X_i^{0} \equiv 1$).

A simple application of Theorem 5.4 yields the following moment inequality:

FACT 5.5. If $X$ is a nonnegative random variable, either continuous or discrete, then $\prod_{i=1}^{n} \mu_{a_i}$ is a Schur-convex function of $\mathbf{a}$, where $\mu_{a_i} = E\left[X^{a_i}\right]$ and $\mu_0 \equiv 1$.

Tong [32] applied these results to a large class of distributions to obtain a partial ordering of positive dependence of random variables via a majorization ordering of the dimension vectors; detailed applications to certain families of probability distributions and stochastic processes were given. In a recent paper, Olkin and Tong [18] applied these results to study the effects of positive dependence in reliability theory and shock models when the random variables are exponentially distributed.
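Fact 5.5 can be checked in closed form for an exponential random variable, where $\mu_a = \Gamma(a+1)$; the Exp(1) choice is an illustrative assumption, not an example from the paper. Since $(2, 0)$ majorizes $(1, 1)$, Schur-convexity predicts $\mu_2 \mu_0 \ge \mu_1 \mu_1$:

```python
# Closed-form check of Fact 5.5 for X ~ Exp(1), where E[X^a] = Gamma(a + 1).
from math import gamma

def mu(a):
    return gamma(a + 1.0)   # E[X^a] for X ~ Exp(1)

print(mu(2) * mu(0))   # 2.0 for the more spread-out exponent vector (2, 0)
print(mu(1) * mu(1))   # 1.0 for (1, 1), which (2, 0) majorizes
```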


Journal: Journal of Inequalities and Applications
Publisher: Hindawi Publishing Corporation
Copyright: © 1997 Hindawi Publishing Corporation
ISSN: 1025-5834
eISSN: 1029-242X

