# Bidimensional Matching with Heterogeneous Preferences: Education and Smoking in the Marriage Market

## Abstract

We develop a frictionless matching model under transferable utility where individuals are characterized by a continuous trait and a binary attribute. The model incorporates attributes for which there are heterogeneous preferences in the population regarding their desirability, that is, the impact of the traits cannot be summarized by a one-dimensional attractiveness index. We present a general resolution strategy based on optimal control theory, and characterize the stable matching. We then consider education and smoking status, further specify the model by observing that there are more male than female smokers above each education level, and derive additional predictions about equilibrium matching patterns and how individuals with different smoking habits “marry down” or “marry up” by education. Using the CPS March and Tobacco Use Supplements for the period 1996–2003, we find that the hypotheses based on our model predictions are borne out in the data.

## 1. Introduction

Empirical evidence strongly suggests that matching processes in the marriage market are multidimensional. Spouses tend to be similar in a variety of characteristics, including age, body mass index, education, race, religion, and smoking status (Becker 1991; Oreffice and Quintana-Domeque 2010; Qian 1998; Silventoinen et al. 2003; Sutton 1980; Weiss and Willis 1997). New developments in the matching literature have started to address the multidimensional nature of attractiveness by considering settings with two or more attributes. In particular, several recent studies1 consider frameworks where multiple characteristics can be summarized by a single, one-dimensional attractiveness index, so that, technically, the matching process is de facto one-dimensional.
Such a single-index approach is a powerful tool, and it allows one to apply the standard one-dimensional matching techniques to settings with multiple attributes. However, its validity relies on a strong homogeneity assumption: it must be the case that the trade-offs between individual female traits are identically perceived by all men (and similarly for male characteristics). This approach, appealing as it may seem, is not always appropriate. When preferences for any of the relevant matching characteristics are heterogeneous within either the male or the female population (or both), matching becomes multidimensional, as long as the traits at stake are not perfectly correlated. That is, the corresponding traits cannot be collapsed into a single index, and one-dimensional matching techniques are not well suited to characterize the corresponding stable matches. In the marriage market, several characteristics can be heterogeneously assessed by potential mates, including race, ethnicity, age, health attributes, or smoking status. For instance, we know that smokers are more likely to marry smokers (Clark and Etilé 2006; Maralani 2009; Sutton 1980; Venters et al. 1984), and that a large medical and public health literature essentially shows that nonsmokers mind their partner’s smoking status, whereas smokers do not.2

The goal of this paper is to extend the standard matching model under Transferable Utility to a particular case of multidimensional matching, where one of the characteristics is assessed heterogeneously by potential mates. In particular, we present a general resolution strategy and characterize the stable matching when individuals differ in two characteristics.3 We also suggest a new way of testing matching models. In most one-dimensional matching models, theoretical implications regarding who marries whom are straightforward; essentially, they boil down to supermodularity implying assortative matching.
In multidimensional contexts, however, things are much more complex, since theory may imply specific properties on matching patterns along the various dimensions under consideration. We argue that such predictions may be taken to the data and tested in reduced form. Specifically, in this paper we study a parsimonious model and derive a series of predictions. We then test hypotheses based on these predictions, and find they are indeed supported by the data. Although our hypothesis tests can be formally derived from several specific stochastic structures found in the literature, we argue that they do not require explicit assumptions on the stochastic structure of unobservables; therefore, and in the same spirit as previous tests of index-based models (e.g., Chiappori, Oreffice, and Quintana-Domeque 2012), such tests can be viewed as a useful complement to more structural approaches.

In our model, individuals are characterized by two dimensions. One characteristic is continuous, and can be interpreted as an index of socioeconomic status (SES from now on), reflecting differences in education, income, social prestige, and others, or any combination of those. The other characteristic is discrete, and more precisely dichotomous. We suggest interpreting the second characteristic as the individual’s smoking status. The marital surplus function is assumed to be differentiable and supermodular in the continuous indices, as is common in the literature, and to be multiplicatively impacted by the discrete characteristic: it may be diminished by the presence of a smoker in the couple. As long as smoking and socioeconomic status are not perfectly correlated (and they are not), this heterogeneity in preferences is a key feature of our setting: it rules out a single-index representation, since the trade-off between the two characteristics is perceived differently among potential spouses, which is precisely what index models forbid.
We first analyze, as a benchmark, a fully symmetric version of the model, in which male and female characteristics play the same role in the surplus function and are identically distributed. We show that the resulting stable matching exhibits full segregation, in the sense that smokers exclusively marry smokers. This result is interesting per se, since it does not depend on the magnitude of the negative impact of smoking. In our “pure” frictionless framework, therefore, even minor differences in preferences (or surplus) may result in large-scale segregation; moreover, the latter is efficient, in the sense that it does maximize total welfare. We then consider the general asymmetric case. We prove existence and generic uniqueness of the stable matching and present a general resolution strategy, which can be used in any discrete or continuous multidimensional framework. Even in our simple framework, a closed-form characterization of the stable matching cannot be obtained in the general case; however, we derive general predictions on the form that the stable matching may take. The analysis of a specific, quadratic case in which a closed form solution exists is provided in our Supplementary Material. Next, we further specify the model by assuming that, for any SES level, there are more male smokers than female smokers among individuals above that SES level, a pattern that we actually observe in the data. This generates a set of additional predictions regarding the nature of the stable matching. First, there are no “mixed” couples in which she smokes and he does not, whereas the opposite pattern—he smokes and she does not—happens with positive probability. Second, smoking husbands married to smokers are of higher “quality” (i.e., higher SES) than those married to nonsmokers: smoking “premium” for smoking wives. Third, and conversely, nonsmoking wives married to smoking husbands have a lower SES than those married to nonsmokers: smoking “penalty” for smoking husbands. 
Fourth, there is positive assortative matching on SES among couples with identical smoking habits. Fifth, positive assortative matching on SES by smoking status is stronger at the top of the SES distribution: smoking husbands (wives) at the top of the SES distribution are more likely to marry smoking wives (husbands) than those with lower SES. Similarly, nonsmoking husbands (wives) at the top of the SES distribution are more likely to marry nonsmoking wives (husbands) than those with lower SES. Finally, some female nonsmokers may marry either a smoker or a nonsmoker with positive probability; then it must be the case that the smoking husband has a higher SES. Similarly, some male smokers may marry either a smoker or a nonsmoker; but then both women have the same SES. We use the Current Population Survey March Supplements data combined with the Tobacco Use Supplements (TUS) for the period 1996–2003. These TUS supplements, which are widely used in medical research on smoking, provide the largest representative sample of the US population, and, crucially, allow us to retrieve information on both spouses. We follow a reduced-form approach to test the hypotheses based on the theoretical predictions on the stable matching derived from our deterministic model, acknowledging that our general model could not be solved under a general stochastic structure. Our approach is compatible with a Choo–Siow specification, that is, a framework where unobserved heterogeneity can be fully captured by an additive random term, which is moreover the sum of two random variables with type 1 extreme value distributions; but it could also be used in alternative settings.4 In our deterministic model, stability conditions imply that there should be no “mixed” couples in which she smokes and he does not, whereas the opposite pattern—he smokes and she does not—should happen with positive probability (the asymmetry between genders being due to the larger prevalence of male smoking). 
In a more complex setting, the presence of either search frictions (à la Shimer and Smith 2000) or unobserved matching characteristics (or both) typically results in positive probabilities for all possible couples. Still, a natural hypothesis is that mixed couples where the wife smokes should be much less frequent than vice versa; that is, the ratio of the two subpopulation sizes should be significantly lower than what would be implied by the sole difference in relative smoking prevalence.5 To test this hypothesis, we compare the actual subpopulations ratio to what it should be under independence for each of the nine US Geographical Census divisions; interestingly, we find that the hypothesis is satisfied in each of them, despite significantly different smoking prevalences across regions. This is consistent with our first prediction. Similarly, our regression analysis shows that among smoking husbands, those who marry smoking wives exhibit (on average) 0.15 more years of completed education (or about a 1.2% higher annual earnings) than those with nonsmoking wives, consistent with our second prediction. Conversely, our evidence reveals that among nonsmoking wives, those with smoking husbands exhibit (on average) 0.11–0.13 fewer years of completed education than those with nonsmoking husbands, supporting our third prediction. Consistent with our fourth prediction, we also find positive assortative matching on education for each type of couple, and in particular for couples with identical smoking habits. In addition, at the top of the SES distribution, positive assortative matching on education by smoking status is stronger, supporting our fifth prediction. 
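The independence benchmark behind the first hypothesis above can be sketched in a few lines. This is only an illustration of one way to read the comparison, not the estimation code used in the paper: the function name `mixed_couple_ratios` and all four couple counts are hypothetical and made up for the example.

```python
def mixed_couple_ratios(n_nn, n_ns, n_sn, n_ss):
    """Compare the observed ratio of mixed couples with its independence benchmark.

    Counts are indexed wife-status then husband-status, so n_sn is the number
    of couples in which she smokes and he does not (hypothetical inputs).
    """
    total = n_nn + n_ns + n_sn + n_ss
    wife_smokes = (n_sn + n_ss) / total      # smoking prevalence among wives
    husband_smokes = (n_ns + n_ss) / total   # smoking prevalence among husbands
    # Observed ratio: (wife smokes, husband does not) over (husband smokes, wife does not).
    observed = n_sn / n_ns
    # Same ratio if smoking statuses were matched independently given the marginals.
    independent = (wife_smokes * (1 - husband_smokes)) / (
        (1 - wife_smokes) * husband_smokes
    )
    return observed, independent


# Made-up counts, for illustration only.
observed, independent = mixed_couple_ratios(n_nn=700, n_ns=150, n_sn=50, n_ss=100)
print(f"observed ratio: {observed:.3f}, independence ratio: {independent:.3f}")
```

With these invented counts the observed ratio lies below the independence ratio, which is the direction the hypothesis refers to; real data would of course require the survey weights and the division-level breakdown described in the text.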
Finally, the well-known negative correlation between education and smoking is confirmed in our data, and we estimate that for men this correlation becomes less negative if one further controls for the wife’s education, whereas this pattern does not appear for women; we argue that this fact is in line with our last prediction. Perhaps the most informative prediction of our bidimensional matching model is the one stating that smoking husbands married to smokers have higher SES than those married to nonsmokers. Such a prediction is typical of a matching logic, in which female smokers eventually benefit from being on the short side of the market. A potential concern here is that such empirical patterns (even when controlling for additional covariates), instead of being a consequence of matching mechanisms on the marriage market, may reflect the fact that smoking behavior is endogenous to marital status. However, we doubt this can explain our findings. Using information on engaged couples, newlyweds, and couples married for over 5 years, Sutton (1993) finds that similarities in smoking status were already present about the time of marriage. More recently, Banks, Kelly, and Smith (2013), who use retrospective evidence from the Health and Retirement Study, show that most smoking behavior is initiated before marriage, so that smoking is predetermined with respect to marital status. In addition, we use a simple and popular estimator in program evaluation (see Wooldridge 2002) obtained from an OLS regression of education on spousal smoking status controlling for the predicted probability that the spouse is a smoker (and its square). The key variable(s) used in predicting such a probability is (are) the age at which the spouse started smoking regularly (and its square). Reassuringly, the estimates adjusted by the spousal propensity to smoke give a similar picture to our main estimates. 
Finally, we present two refutability tests in our Supplementary Material: one based on randomly sorting married men and women to couples, which shows that the predictions of our model are rejected; another based on testing a single-index model in education and smoking, which is rejected by the data.

The next section contains the model. Section 3 describes the data. Section 4 tests the hypotheses based on the theoretical predictions of our model. Section 5 presents a brief discussion, whereas Section 6 concludes.

## 2. The Model

We present a parsimonious preference model given the two types of characteristics at stake, defining the surplus structure and the corresponding stable matches under Transferable Utility (from now on TU).6 After characterizing the equilibrium under a benchmark “symmetric” case (where male and female characteristics play the same role in the surplus function and are identically distributed), we analyze the general case deriving predictions on the form that the stable matching may take, which can be used in any discrete or continuous multidimensional framework, although a closed-form characterization of the stable matching cannot be obtained. Finally, we introduce the male prevalence condition into the model (for any SES level, there are more male smokers than female smokers among individuals above that SES level), and derive several predictions that will be the basis of the hypotheses in our empirical analysis.

### 2.1. The Basic Framework

#### 2.1.1. Populations

We consider two populations (men and women) of equal size, normalized to one. Agents differ in two respects. First, they are characterized by a continuous index; one may, without loss of generality, assume that this index is uniformly distributed over the interval [0, 1]. A possible interpretation is in terms of socioeconomic status (SES); then the index depends on the agent’s income, education, prestige, or any combination of those.
Second, agents are also characterized by a dichotomous indicator taking values in the set {N, S}; in our empirical application, S stands for “smoker” and N for “nonsmoker”, although alternative interpretations are possible. An agent is thus formally characterized by a pair (x, X) if female and (y, Y) if male, where x, y ∈ [0, 1] is the agent’s continuous index (i.e., SES), and X, Y ∈ {N, S} is the agent’s discrete characteristic (i.e., smoking status). Let F (resp. G) denote the cumulative distribution of female (resp. male) characteristics (x, X) (resp. (y, Y)) over the set [0, 1] × {N, S}, and let $F_X(x)$ (resp. $G_Y(y)$) denote the number of females (resp. males) with smoking status X (resp. Y) and SES no larger than x (resp. y). In particular, $F_X(1)$ and $G_Y(1)$ denote the total number of females and males with smoking habits X and Y, respectively.

#### 2.1.2. Surplus

In any married couple, the sum of individual utilities is given by some function of the partners' characteristics; as is customary, we define the surplus generated by marriage as the difference between this function and the sum of utility levels that each spouse would reach as single. In our framework, the surplus depends on both the discrete and the continuous characteristics of each partner. We assume that the surplus Σ generated by a match between (x, X) and (y, Y) has the form

\begin{equation*}
\Sigma\left( \left( x,X\right) ,\left( y,Y\right) \right) =
\begin{cases}
S\left( x,y\right) & \text{if } X=Y=N,\\
\lambda S\left( x,y\right) & \text{otherwise}.
\end{cases}
\end{equation*}

The function S is strictly increasing, continuously differentiable and supermodular; moreover, λ < 1. This means that the impact on marital surplus of the spouses’ smoking habits is fully summarized by a single parameter λ, which captures the decrease in surplus due to the presence of at least one smoker in the couple.
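The surplus specification above can be written as a small function. This is a minimal sketch under stated assumptions: the base surplus S(x, y) = x·y is chosen here only because it is a simple supermodular function that is increasing for positive arguments (the paper does not impose this form), the names `base_surplus` and `surplus` are hypothetical, and λ is assumed to lie strictly between 0 and 1.

```python
def base_surplus(x: float, y: float) -> float:
    """Assumed illustrative S(x, y) = x * y (not the paper's specification)."""
    return x * y


def surplus(x: float, X: str, y: float, Y: str, lam: float = 0.9) -> float:
    """Surplus of a match between a woman (x, X) and a man (y, Y).

    X and Y are "N" (nonsmoker) or "S" (smoker); lam plays the role of
    lambda and is assumed to lie strictly between 0 and 1.
    """
    if not 0.0 < lam < 1.0:
        raise ValueError("lam is assumed to lie in (0, 1)")
    if X not in ("N", "S") or Y not in ("N", "S"):
        raise ValueError("smoking status must be 'N' or 'S'")
    s = base_surplus(x, y)
    # Full surplus only when both spouses are nonsmokers; scaled by lam otherwise.
    return s if (X == "N" and Y == "N") else lam * s


nonsmoking_pair = surplus(0.5, "N", 0.8, "N")
mixed_pair = surplus(0.5, "N", 0.8, "S")
smoking_pair = surplus(0.5, "S", 0.8, "S")
print(nonsmoking_pair, mixed_pair, smoking_pair)
```

In this sketch the mixed couple and the smoking couple generate the same surplus, and both generate less than the nonsmoking couple with the same SES levels.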
Hence, in our framework, the surplus of a mixed (smoker, nonsmoker) couple is the same as that of a couple of smokers, but strictly less than that of a nonsmoking pair. Restrictive as it may seem, we believe that the common λ assumption is a reasonable approximation. The multiplicative nature of the impact of λ reflects the fact that smoking, by reducing life expectancy, decreases the present discounted value of future welfare proportionally. We believe that this is more realistic than an additive parameter, as the latter would imply that those couples with lower socioeconomic resources (and surplus) would be disproportionately penalized by having (at least) a smoker in the couple. In addition, a large medical and public health literature7 shows that a smoking partner decreases life expectancy of a nonsmoker but not of a smoker, which is compatible with our assumption.8 Last but not least, although the single parameter λ assumption cannot be directly tested (since the marital surplus is not observed), one can evaluate its plausibility using auxiliary conditions of the Choo–Siow type, such as unobserved heterogeneity following a type 1 extreme value distribution. We perform this exercise and find empirical support for it (see Table E.1 in the Supplementary Material).

### 2.2. Stable Matching

#### 2.2.1. Definition

A matching is defined as a measure μ on the set $([0,1] \times \{N,S\})^{2}$ and four functions $u_N(x)$, $u_S(x)$, $v_N(y)$ and $v_S(y)$. Intuitively, for two sets A, B ⊂ [0, 1] × {N, S}, μ[A, B] denotes the probability that a woman belonging to A is married to a man belonging to B; and for any female (x, X) (male (y, Y)), with X, Y ∈ {N, S}, $u_X(x)$ ($v_Y(y)$) is the utility she (he) receives at a stable matching. A constraint on μ is that its marginals should equal the initial distributions of individuals; that is, the marginal on the set of females (males) is F (G).
In addition, on the support of μ, individual utilities satisfy

\begin{equation*}
u_{X}\left( x\right) +v_{Y}\left( y\right) =\Sigma \left( \left( x,X\right) ,\left( y,Y\right) \right) ,\quad \forall \left( \left( x,X\right) ,\left( y,Y\right) \right) \in \mathit{Supp}\left( \mu \right) ,
\end{equation*}

reflecting the fact that if two agents may marry with positive probability, their individual utilities must add up to the surplus they generate when married. A matching is stable if no matched agent would be better off unmatched, and if no two individuals would prefer being matched together to their current situation. Normalizing singles’ utility to zero, stability can be summarized by the following set of inequalities: for any (x, X), (y, Y) we have that

$$u_{X}\left( x\right) \ge 0,\quad v_{Y}\left( y\right) \ge 0 \quad \text{and} \quad u_{X}\left( x\right) +v_{Y}\left( y\right) \ge \Sigma \left( \left( x,X\right) ,\left( y,Y\right) \right), \tag{1}$$

therefore

$$u_{X}\left( x\right) +v_{Y}\left( y\right) \ge \begin{cases} S\left( x,y\right) & \text{if } X=Y=N,\\ \lambda S\left( x,y\right) & \text{otherwise}, \end{cases} \tag{2}$$

where an equality obtains on the support of μ. The first constraints in (1) reflect the requirement that married people should prefer marriage to singlehood; the second constraint in (1) expresses that any two individuals cannot, by forming a new match, strictly increase their current utilities.

#### 2.2.2. Existence

Existence of a stable match stems from general results, which state that in a TU context, the minimization of aggregate utility over the set of stable matches is equivalent to the maximization of aggregate surplus over all possible assignments.9 Formally, if $(\mu, u_N(x), u_S(x), v_N(y), v_S(y))$ is a stable matching, then the measure μ solves

$$\max _{\nu \in \mathcal{M}}\int \Sigma \left( \left( x,X\right) ,\left( y,Y\right) \right) d\nu \left( \left( x,X\right) ,\left( y,Y\right) \right), \tag{3}$$

where $\mathcal{M}$ denotes the set of measures on the set $([0,1] \times \{N,S\})^{2}$ whose marginal distributions coincide with the initial measures F and G on the female and male populations, respectively. Since this set is compact and Σ is continuous in x and y, a solution exists. Conversely, for any solution $\bar{\mu}$ to the surplus maximization problem, consider the dual program

\begin{align*}
\min _{u_{N},u_{S},v_{N},v_{S}} \; & \int _{\left[ 0,1\right] \times \left\lbrace N,S\right\rbrace }\left( \boldsymbol{1}\left[ X=S\right] u_{S}\left( x\right) +\boldsymbol{1}\left[ X=N\right] u_{N}\left( x\right) \right) dF\left( x,X\right) \\
& +\int _{\left[ 0,1\right] \times \left\lbrace N,S\right\rbrace }\left( \boldsymbol{1}\left[ Y=S\right] v_{S}\left( y\right) +\boldsymbol{1}\left[ Y=N\right] v_{N}\left( y\right) \right) dG\left( y,Y\right)
\end{align*}

under the constraints in (1). If $\left( \bar{u}_{N},\bar{u}_{S},\bar{v}_{N},\bar{v}_{S}\right)$ denotes a solution, then $\left( \bar{\mu },\bar{u}_{N},\bar{u}_{S},\bar{v}_{N},\bar{v}_{S}\right)$ defines a stable matching.

Let us note that there may exist matches involving “mixed strategies”, whereby an open set of agents are each matched to several agents with positive probabilities. In our setting, an individual may be indifferent between two types of mates.
For instance, if a woman with SES $x_0$ is a nonsmoker, the partial derivative of the surplus with respect to x is $\partial S(x_0, y_1)/\partial x$ if she marries a nonsmoker with SES $y_1$, and $\lambda\, \partial S(x_0, y_2)/\partial x$ if she is matched with a smoker with SES $y_2$. Although S is strictly supermodular, we may still have that

\begin{equation*}
\frac{\partial S}{\partial x}\left( x_{0},y_{1}\right) =\lambda \frac{\partial S}{\partial x}\left( x_{0},y_{2}\right)
\end{equation*}

with $y_2 > y_1$ since λ < 1.

### 2.3. The Symmetric Case

As a benchmark, we consider a setting that is totally symmetric between genders. This implies that (i) male and female characteristics play the same role in the surplus function (S(x, y) = S(y, x)) and (ii) the distributions of characteristics are identical for men and women (F = G). Then, the stable matching can easily be characterized.

Proposition 1. Under the symmetry assumptions (i) and (ii) above, there exists a unique stable matching, which is completely assortative. Smokers only marry smokers, and nonsmokers only marry nonsmokers; moreover, in each couple, spouses have the same SES.

*Proof.* It is well known that a matching is stable if and only if the corresponding measure maximizes total surplus. In this context, positive assortativeness within each smoking category directly follows from supermodularity. We simply need to show that it cannot be the case that an open set of nonsmokers of one gender marry smokers. Assume it is; then an open set of equal measure of nonsmokers of the opposite gender also marry smokers. But since λ < 1, the total surplus is then less than in the completely assortative matching, a contradiction.

Although the symmetric case is obviously very specific, it constitutes an interesting benchmark. A first lesson that can be drawn from it is that bidimensional matching of the type under consideration naturally leads to segregated outcomes.
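Proposition 1 can be illustrated on a tiny discrete economy by brute force. The sketch below is not the paper's resolution method: it assumes S(x, y) = x·y, an illustrative λ = 0.95, and four hypothetical agent types that are identical across genders, and it simply enumerates every assignment to find the one with the highest total surplus.

```python
from itertools import permutations

LAM = 0.95  # assumed illustrative value of lambda, close to one


def surplus(woman, man, lam=LAM):
    """Surplus of a couple; each agent is a (SES, status) pair."""
    (x, X), (y, Y) = woman, man
    s = x * y  # assumed base surplus S(x, y) = x * y
    return s if (X == "N" and Y == "N") else lam * s


def best_assignment(women, men, lam=LAM):
    """Enumerate all one-to-one assignments and keep the surplus-maximizing one."""
    best_total, best_perm = float("-inf"), None
    for perm in permutations(range(len(men))):
        total = sum(surplus(w, men[j], lam) for w, j in zip(women, perm))
        if total > best_total:
            best_total, best_perm = total, perm
    return best_perm, best_total


# Symmetric economy: the same (hypothetical) types on both sides of the market.
agents = [(0.2, "N"), (0.5, "S"), (0.6, "N"), (0.9, "S")]
perm, total = best_assignment(agents, agents)
couples = [(w, agents[j]) for w, j in zip(agents, perm)]
for w, m in couples:
    print(w, "marries", m)
```

In this example every woman is matched with the man who has exactly her SES and smoking status, even though λ is close to one. Brute-force enumeration is only practical for a handful of agents; it is used here because it keeps the example free of external solvers.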
In the symmetric context, even if the loss incurred when a nonsmoker marries a smoker is very small (i.e., λ is very close to one), the marriage patterns exhibit complete segregation, in the sense that smokers exclusively marry smokers and nonsmokers exclusively marry nonsmokers. In other words, minor differences in preferences may have a spectacular impact on marital patterns, particularly in terms of segregation. In addition, one can easily compute each spouse’s utility at a stable matching (see Appendix).

The preceding conclusions, however, heavily rely on the very specific features of the symmetric framework. In particular, no trade-off exists between the two characteristics at the stable matching: there is no point, for a nonsmoker, in considering a smoker as a spouse, since a nonsmoker with exactly the same SES is always available at equilibrium.

### 2.4. The General Case

We now consider the general case. In what follows, let $p_N(x)$ be the probability that a nonsmoking woman with SES x marries a smoker (then $1 - p_N(x)$ is the probability she marries a nonsmoker). We define similarly $p_S(x')$, $q_N(y)$ and $q_S(y')$ as the probability of marrying a smoker for a female smoker, a male nonsmoker and a male smoker, respectively. These probabilities are endogenous, and are determined by the equilibrium (stability) conditions.

#### 2.4.1. Qualitative Results

In this subsection, we provide some qualitative properties of the equilibrium, which hold true irrespective of the exact distribution of smokers and nonsmokers in the population and the exact form of the (supermodular) surplus. A first result expresses the fact that, at a stable matching, matching is positive assortative on SES among couples with identical smoking habits.

Proposition 2. Consider two matched couples, ((x, X), (y, Y)) and ((x′, X), (y′, Y)), with identical smoking status. For almost all such couples, x ≥ x′ if and only if y ≥ y′.

*Proof.*
Assume, for instance, that x ≥ x′ but y < y′ on a subset of positive measure. The surplus generated by any two such couples is

\begin{equation*}
\Sigma _{1}=\Sigma \left( \left( x,X\right) ,\left( y,Y\right) \right) +\Sigma \left( \left( x^{\prime },X\right) ,\left( y^{\prime },Y\right) \right)
\end{equation*}

whereas the matching ((x, X), (y′, Y)) and ((x′, X), (y, Y)) would generate a surplus

\begin{equation*}
\Sigma _{2}=\Sigma \left( \left( x,X\right) ,\left( y^{\prime },Y\right) \right) +\Sigma \left( \left( x^{\prime },X\right) ,\left( y,Y\right) \right) >\Sigma _{1}
\end{equation*}

by strict supermodularity of Σ in (x, y). Should this situation exist for a non-null set of couples, the matching would not maximize total surplus, a contradiction.

The second result states that among individuals with high SES, matching is assortative on both SES and smoking status.

Proposition 3. Assume that the upper bound of the support of the measures $F_S$, $F_N$, $G_S$, and $G_N$ is 1. Then there exist thresholds $x_N$, $x_S$, $y_N$, and $y_S$ in [0, 1) such that for almost all $x_N \le x \le 1$, $x_S \le x' \le 1$, $y_N \le y \le 1$, and $y_S \le y' \le 1$,

\begin{equation*}
p_{N}\left( x\right) =q_{N}\left( y\right) =0 \quad \text{and} \quad p_{S}\left( x^{\prime }\right) =q_{S}\left( y^{\prime }\right) =1.
\end{equation*}

*Proof.* See Appendix.

Proposition 3 states that nonsmokers with high enough SES marry nonsmokers with probability 1: marrying a smoker would scale the couple's surplus down by the factor λ, which for high SES can only decrease total surplus. Similarly, smokers with high enough SES only marry smokers. However, the previous result is only true at the top of the SES distribution; further down, randomization may appear at a stable matching. To see why, assume that the distributions of the male and female populations are such that men are much more likely to smoke than women.
The assortative pattern described in Proposition 2, together with the measure restrictions, implies that both nonsmoking wives and smoking husbands, being on the long side of the market, will have to “marry down” (i.e., marry a spouse with relatively lower SES). At some point, the marginal nonsmoking wife may become indifferent between marrying a nonsmoker or the marginal smoking husband, because the resulting loss in total surplus (due to λ < 1) is exactly offset by the higher SES of the latter. Note that, in particular, male smokers who match with female nonsmokers must have a higher socioeconomic status than their wife; indeed, the only reason why a female nonsmoker would agree to marry a smoker (thus accepting a shrinkage of total surplus) is the shortage of male nonsmokers.

This particular pattern is not directly linked to the multiplicative nature of the impact of smoking; as soon as marrying a smoker involves a loss in surplus, we expect similar patterns. What is specific to the multiplicative setting, however, is that these phenomena do not take place at the top of the distribution, where the loss is larger.

This same logic also suggests, however, that in a given neighborhood, not all forms of randomization are simultaneously possible. In our example, for instance, although female nonsmokers may want to marry a smoking spouse, male nonsmokers would not, since they would lose a share 1 − λ of the surplus and the opportunity to “marry up”. This feature is indeed general and is expressed by the following result.

Proposition 4. Assume there exists an open set O such that, for all x ∈ O, $0 < p_N(x) < 1$, so that x marries either a nonsmoker y or a smoker y′ with positive probability. Assume moreover that $q_S(y') > 0$, so that y′ also marries a smoker x′ with positive probability. Then $q_N(y) = 0$ and $p_S(x') = 1$ almost surely. Moreover, x′ = x and y′ > y.
Similarly, if for all y in some open set O′, $0 < q_N(y) < 1$, so that y marries either a nonsmoker x or a smoker x′ with positive probability, and $p_S(x') > 0$, so that x′ also marries a smoker y′ with positive probability, then $p_N(x) = 0$ and $q_S(y') = 1$ almost surely. Moreover, x′ > x and y′ = y.

*Proof.* See Appendix.

Proposition 4 states that the various types of randomizations are mutually exclusive. In the neighborhood of some given SES, it may be the case that female nonsmokers and male smokers intermarry with positive probability; but then, in this same neighborhood, female smokers and male nonsmokers only marry their own type. Of course, the pattern may be opposite in a different neighborhood10; ultimately, the matching patterns depend on the distributions F and G.

#### 2.4.2. Surplus Maximization

We now describe the mathematical form of the problem, and indicate a resolution strategy that works for arbitrary distributions. Start with the constraint that the marginals of the stable measure μ must coincide with the female and male population measures. Proposition 2 greatly simplifies the expression of these constraints, as the following shows. Consider a nonsmoker female with SES x, matched with a nonsmoker male with SES $y = \phi_N(x)$. Then the number of nonsmoker females with SES higher than x, married to a nonsmoker, must equal the number of nonsmoker males with SES higher than y, married to a nonsmoker,

$$\int _{x}^{1}\left( 1-p_{N}\left( t\right) \right) dF_{N}\left( t\right) =\int _{\phi _{N}\left( x\right) }^{1}\left( 1-q_{N}\left( t\right) \right) dG_{N}\left( t\right), \tag{4}$$

which defines the function $\phi_N$. Similarly, for a nonsmoker female with SES x marrying a smoker with SES $y' = \psi_N(x)$,

$$\int _{x}^{1}p_{N}\left( t\right) dF_{N}\left( t\right) =\int _{\psi _{N}\left( x\right) }^{1}\left( 1-q_{S}\left( t\right) \right) dG_{S}\left( t\right), \tag{5}$$

which defines $\psi_N$.
For the other combinations of smoking status,
\begin{align} \int _{x}^{1}\left( 1-p_{S}\left( t\right) \right) dF_{S}\left( t\right) =\int _{\phi _{S}\left( x\right) }^{1}q_{N}\left( t\right) dG_{N}\left( t\right), \end{align} (6)
\begin{align} \int _{x}^{1}p_{S}\left( t\right) dF_{S}\left( t\right) =\int _{\psi _{S}\left( x\right) }^{1}q_{S}\left( t\right) dG_{S}\left( t\right). \end{align} (7)
In particular, there exists a one-to-one relationship between the four probability functions (pN, pS, qN, qS) and the four matching functions (ϕN, ϕS, ψN, ψS).

Clearly, the exact marital patterns characterizing the stable matching depend on the joint distribution of SES and smoking status of the two populations. Here, we provide the general tool that can be used to solve the problem with arbitrary distributions. The idea, again, is to exploit the duality between stability and surplus maximization. With the previous notation, aggregate surplus is
\begin{align} \Sigma & =\int _{0}^{1}\left[ \left( 1-p_{N}\left( t\right) \right) S\left( t,\phi _{N}\left( t\right) \right) +\lambda p_{N}\left( t\right) S\left( t,\psi _{N}\left( t\right) \right) \right] dF_{N}\left( t\right) \\ &\quad {} +\lambda \int _{0}^{1}\left[ \left( 1-p_{S}\left( t\right) \right) S\left( t,\phi _{S}\left( t\right) \right) +p_{S}\left( t\right) S\left( t,\psi _{S}\left( t\right) \right) \right] dF_{S}\left( t\right). \end{align} (8)
The first integral considers the contribution of the female nonsmoker population. An individual with SES t may (with probability pN(t)) be matched with a smoker with SES ψN(t), in which case the couple generates a surplus λS(t, ψN(t)); alternatively, she may (with probability 1 − pN(t)) be matched with a nonsmoker with SES ϕN(t), generating a surplus S(t, ϕN(t)). Similarly, the second integral represents the contribution of the female smoker population to total surplus.
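To make the measure constraints concrete, here is a minimal numerical sketch of constraint (4) in the special case pN = qN = 0, where nonsmokers only marry nonsmokers. In that case (4) reduces to equating the mass of nonsmoking women above x with the mass of nonsmoking men above ϕN(x). The two distribution functions below are hypothetical choices made purely for illustration (they are not taken from the paper), and the function name `solve_phi` is likewise an assumption of this sketch.

```python
def solve_phi(x, female_cdf, male_cdf, tol=1e-10):
    """Solve 1 - female_cdf(x) = 1 - male_cdf(y) for y in [0, 1] by bisection.

    This is constraint (4) with p_N = q_N = 0 and both populations
    normalized to total mass 1 (assumptions of this sketch).
    """
    target = 1.0 - female_cdf(x)  # mass of nonsmoking women with SES above x
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The mass of men above `mid` decreases in `mid`:
        # if it is still too large, the solution lies to the right.
        if 1.0 - male_cdf(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical distributions, for illustration only.
female_cdf = lambda t: t       # F_N: uniform on [0, 1]
male_cdf = lambda t: t ** 2    # G_N: density 2t on [0, 1]

for x in (0.25, 0.64):
    print(x, round(solve_phi(x, female_cdf, male_cdf), 6))
```

With these particular choices the constraint can be solved by hand, giving ϕN(x) = √x, so the numerical solution can be checked directly. The resulting matching function is increasing, as positive assortative matching requires. Handling the general case, where pN and qN are themselves unknown, requires the full optimal control problem rather than this one-equation shortcut.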
A stable matching, defined by the functions pN, pS, qN, qS and ϕN, ϕS, ψN, ψS, linked by (4)–(7), maximizes aggregate surplus under the constraints
\begin{equation*} 0 \leq p_{A}(t) \leq 1,\quad 0 \leq q_{A}(t) \leq 1,\quad 0 \leq \phi _{A}(t) \leq 1,\quad 0 \leq \psi _{A}(t) \leq 1, \end{equation*}
where A = N, S. The stable matching can therefore be derived as a solution to a maximization (optimal control) problem. Finally, once the functions pN, pS, qN, qS (or equivalently ϕN, ϕS, ψN, ψS), which define the stable matching, have been computed, one can readily recover the intracouple allocation of the surplus (see Appendix).

2.5. The Constrained General Case

We now introduce the additional assumption that “men smoke more than women”, in the following sense.

Assumption MP (Male Prevalence). For any SES level x,
\begin{equation*} F_{S}\left( x\right) \ge G_{S}\left( x\right). \end{equation*}
In words, for any SES x, there are more male than female smokers with SES x or larger. This may describe a situation in which men are more likely to smoke, but also one in which men and women have similar smoking habits but the male SES distribution first-degree dominates the female one, as well as many others. As we shall see in the empirical part, Assumption MP is satisfied in the data. Under that assumption, it is possible to characterize the stable matching as follows.

Proposition 5. Assume that the upper bound of the support of the measures FS, FN, GS and GN is 1.
Then there exist thresholds xN, xS, yN and yS in [0, 1) such that, for almost all xN ≤ x ≤ 1, xS ≤ x΄ ≤ 1, yN ≤ y ≤ 1 and yS ≤ y΄ ≤ 1,
\begin{equation*} p_{N}\left( x\right) =q_{N}\left( y\right) =0\quad{\textit{and}}\quad p_{S}\left( x^{\prime }\right) =q_{S}\left( y^{\prime }\right) =1; \end{equation*}
for almost all x < xN, x΄ < xS, y < yN and y΄ < yS,
\begin{equation*} 0\le p_{N}\left( x\right) \le 1,\quad 0\le q_{S}\left( y^{\prime }\right) \le 1,\quad q_{N}\left( y\right) =0\quad{\textit{and}}\quad p_{S}\left( x^{\prime }\right) =1; \end{equation*}
for any x such that pN(x) > 0, let ϕN(x) and ψN(x), respectively, denote the nonsmoker and smoker mate of (x, N). Then
\begin{equation*} \psi _{N}\left( x\right) >\phi _{N}\left( x\right); \end{equation*}
for any y΄ such that qS(y΄) < 1, let ϕS(y΄) and ψS(y΄), respectively, denote the nonsmoker and smoker mate of (y΄, S). Then
\begin{equation*} \phi _{S}\left( y^{\prime }\right) =\psi _{S}\left( y^{\prime }\right). \end{equation*}

Proof. See Appendix.

Proposition 5 first states that high SES people are exclusively matched to partners with identical smoking habits, a result already derived in Proposition 3. For lower SES, randomization may happen, as Proposition 4 indicates, but only in one direction: nonsmoking women may marry smokers, but smoking women and nonsmoking men always marry their own. Lastly, when a female nonsmoker may marry both a smoker and a nonsmoker with positive probability, the smoker has a higher SES than the nonsmoker.

A remark coming directly from the optimal control formulation sheds some light on the form of the randomization. The Hamiltonian corresponding to the program (8) is linear in pN(x), the coefficient being equal to λS(x, ψN(x)) − S(x, ϕN(x)), which is always nonpositive. For large values of x, ψN(x) and ϕN(x) are both close to 1, and since λ < 1 the coefficient is negative, implying pN(x) = 0.
Randomization requires the coefficient to be zero; therefore, whenever randomization takes place, it must be the case that
\begin{equation*} S\left( x,\phi _{N}\left( x\right) \right) =\lambda S\left( x,\psi _{N}\left( x\right) \right) \end{equation*}
(which, incidentally, implies ψN(x) > ϕN(x)). Since both ϕN(x) and ψN(x) depend on the probability pN(x) (by (4) and (5)), pN(x) must be such that this equality holds over the relevant domain. In addition, the intracouple allocation of the surplus can be recovered (see Appendix).

Finally, we note that where exactly randomization occurs, and with which probability, depends on additional assumptions on the exact form of the surplus function and the two distributions. As an illustration, we provide an example in the Supplementary Material, where (i) smoking status and SES are independent, (ii) men are more likely to be smokers than women at all SES levels, and (iii) the surplus S is quadratic and symmetric. In that case, the program can be solved in closed form. There is a threshold below which nonsmoking women randomize their mates; moreover, the probability does not depend on the SES. More complex patterns can be constructed by varying the distributions of characteristics.

2.5.1. Theoretical Predictions

In summary, the theoretical predictions of the constrained general case are as follows.

Prediction 1. There are no “mixed” couples in which she smokes and he does not, whereas the opposite pattern (he smokes and she does not) happens with positive probability.

Prediction 2. Smoking husbands married to smokers are of higher “quality” (i.e., higher SES) than those married to nonsmokers: smoking “premium” for smoking wives.

Prediction 3. Nonsmoking wives married to smoking husbands have a lower SES than those married to nonsmokers: smoking “penalty” for smoking husbands.

Prediction 4. There is positive assortative matching on SES among couples with identical smoking habits.

Prediction 5.
Positive assortative matching on SES by smoking status is stronger at the top of the SES distribution: smoking husbands (wives) at the top of the SES distribution are more likely to marry smoking wives (husbands) than those with lower SES. Similarly, nonsmoking husbands (wives) at the top of the SES distribution are more likely to marry nonsmoking wives (husbands) than those with lower SES.

Prediction 6. When a female nonsmoker marries both a smoker and a nonsmoker with positive probability, the smoking husband has a higher SES than the nonsmoking one. When a male smoker marries both a smoker and a nonsmoker with positive probability, both women have the same SES.

3. Data Description

3.1. CPS March and TUS data

Our empirical application uses years of education as the measure of SES and smoking status for the binary attribute. We use data from the US Current Population Survey (CPS), specifically its annual March supplements and the Tobacco Use Supplements (TUS), for the years 1996–2003, which provide the largest samples of married couples for whom information on tobacco use is available. The standard demographic and education variables are extracted from the annual March CPS supplements, to which data on smoking status are merged from the TUS. The TUS are monthly CPS supplements available discontinuously over time and in different months. Specifically, the available TUS of interest are January and May 1996, 1999, 2000; June 2001; February 2002; and February and June 2003. The CPS is a series of monthly cross sections, with a short longitudinal component. Individuals in the sample are interviewed eight times: for four consecutive months, followed by a break of eight months, and then for the same four months the following year. As such, it is possible to match observations of the same individuals across months, using the household and person identification codes, along with the month-in-sample information.
However, several observations are dropped due to the specific design of the rotation samples by 4-month periods. In addition, we also check for age, gender, and race, to ascertain that the merged observations consistently belong to the same individual.11 The TUS-CPS is a National Cancer Institute sponsored survey of tobacco use and policy information that has been administered as part of the CPS since 1992. It is considered a key and reliable source of national, state, and substate level data on smoking and other tobacco use in the US households, which are widely used in medical research on cancer and other consequences of smoking (e.g., Delnevo and Bauer 2009; Mills et al. 2009). It provides data on a nationally representative sample of about 240,000 civilian, noninstitutionalized individuals aged 15 years and older. We are able to match individuals across months, merging the TUS supplements back to the March supplement of the corresponding year, to build a series of repeated cross-sections for the years 1996, 1999, 2000, 2001, 2002, and 2003. Due to the CPS rotation sample design described above, the sample size of each match is at most 1/4, 1/2 , or 3/4, of the original March sample size (when matched to June, January and May, or February, respectively). In general, the farther from March the TUS supplement month is, the fewer observations can be matched, with the strong restriction that the TUS months of September (1992, 1995, 1998), and November (2001 and 2003) cannot be merged back to March, as they do not share any respondent (Madrian and Lefgren 1999). Nevertheless, our sample is large, with detailed socioeconomic and smoking information on both spouses, and (to the best of our knowledge) it is the first time it is used to study marriage and smoking. We specifically extract husbands and wives from our merged CPS files. 
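The matching fractions quoted above (at most 1/4, 1/2, or 3/4 of the March sample, and no overlap with September or November) follow mechanically from the rotation scheme described in the text: four monthly interviews, an eight-month break, then four more. The sketch below reproduces that arithmetic. It assumes that the eight month-in-sample groups are of equal size, which is an assumption of this illustration rather than a statement taken from the text, and the function names are hypothetical.

```python
from fractions import Fraction

# Months, counted from a household's first interview, in which it is surveyed
# under the four-on, eight-off, four-on rotation described in the text.
INTERVIEW_OFFSETS = {0, 1, 2, 3, 12, 13, 14, 15}


def months_since_first_interview(month_in_sample):
    """Months elapsed since the first interview for month-in-sample 1..8."""
    if month_in_sample <= 4:
        return month_in_sample - 1
    return 12 + (month_in_sample - 5)


def overlap_with_march(months_from_march):
    """Share of the March sample also interviewed `months_from_march` months away.

    Assumes eight equally sized month-in-sample groups (an assumption of
    this sketch).
    """
    shared = 0
    for month_in_sample in range(1, 9):
        elapsed = months_since_first_interview(month_in_sample) + months_from_march
        if elapsed in INTERVIEW_OFFSETS:
            shared += 1
    return Fraction(shared, 8)


for label, offset in [("January", -2), ("February", -1), ("May", 2),
                      ("June", 3), ("September", 6), ("November", 8)]:
    print(label, overlap_with_march(offset))
```

Under these assumptions, a household interviewed in March is also interviewed in February unless March is its first or fifth month in sample, which gives the 3/4 bound; the same reasoning yields 1/2 for January and May, 1/4 for June, and no shared respondents for September and November.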
Married individual records of the reference person and her spouse are then matched on the household identification code (and household number) to create a single observation for each couple, keeping only observations of couples who lived in households with only one family. Our main sample of husbands and wives consists of white couples, where the wife is between 22 and 32 years old and the husband is between 24 and 34 years old. This demographic group allows us to focus on recently married couples: in the United States, the median age at first marriage was around 27 for men and 25 for women for the period 1996–2007. On the other hand, lower bounds of 22 and 24 years old also allow us to include college graduates after they have completed their schooling. The additional two years in the husbands’ bounds are based on the standard median/mean age difference of two years between male and female spouses (Chiappori, Iyigun, and Weiss 2009). Note that the March CPS does not record the duration of marriage; in particular, the June Fertility Supplements, which used to provide the age at (first) marriage, no longer contain it in the most recent years covered by our study. In addition to individual age, we use the state of residence, year of interview, sample household weight, and education of the individual. Since 1992, the CPS has recorded education as degrees attained rather than years of schooling completed; we thus assign the number of years of schooling to the corresponding degrees. We will also consider information on self-reported health status, number of children under age six, and a male-head indicator. March CPS household weights are used to make our sample of couples representative of the US population. From the Tobacco Use Supplement, we retrieve information on the smoking status of each individual.
Specifically, we focus on the respondents’ answers to whether and how often they smoke, whether they have smoked at least 100 cigarettes in their lifetime, and at which age they started smoking regularly. From the first two questions, we construct a dummy variable of smoking status, defining a person as a smoker if she reports to smoke every day or some days, and has smoked at least 100 cigarettes in her lifetime, and as nonsmokers those who say that they never smoke, or those who have smoked less than 100 cigarettes in their lifetime (CDC definition, NCHS 2010). Age at which they started smoking regularly, as a predetermined explanatory variable of current smoking status with respect to marital status, will prove useful in dealing with the potential endogeneity of smoking with respect to marital status. Finally, note that each spouse directly reports his/her information, and self-reporting of smoking status is considered a reliable source of information, as it is found to be validated by measured serum cotinine levels (Caraballo et al. 2001). 3.2. Smoking Prevalence by Gender An interesting aspect of the smoking attribute is the asymmetry in the smoking prevalence across genders. In the United States, and actually in many countries, male smokers largely outnumber female smokers, a discrepancy that has remained stable over the last decades. This gender asymmetry has been emphasized by the Surgeon General (e.g., Surgeon General’s Report 2001), as well as by several studies in various fields (e.g., Gruber 2001 in economics and Öberg et al. 2011 in medicine). Although a negative smoking gradient is observed by education (e.g., Gruber 2001), the gender gap in smoking prevalence is maintained across all education levels. 
For instance, in 2007, the prevalence of smokers by educational level among white men and women 25 years of age and over was as follows: 30.8% versus 23.9% for those with less than high school; 29.9% versus 25.2% for those with high school; 21.8% versus 19.6% for those with some college; and 10.5% versus 8.2% for those with college or above (NCHS 2010). We further investigate this gender smoking asymmetry by computing the number of smokers by gender and educational attainment (years of education) in our data. Figure 1 shows that the number of male smokers is always higher than the number of female smokers, for each educational attainment (years of education). Hence, Assumption MP is satisfied in our data.

Figure 1. Number of smokers by educational level and gender. Married.

3.3. Descriptive Statistics

Table 1 displays the summary statistics of the main variables of interest in our sample. Panel A reveals that women are slightly more educated and slightly less likely to be in excellent or very good health than their husbands, whereas their average number of children under six years old is about 0.8. Panel B reports the same statistics for never married individuals, showing again that men are healthier and less educated than women, and with a negligible number of young children. Finally, note the asymmetry in the smoking prevalence across genders: the smoking prevalence is 22% for husbands versus 17% for wives (29% and 26% for never-married men and women).

Table 1. Summary statistics. Means (standard deviations).

| A. Married | Husbands | Wives |
|---|---|---|
| Age (years) | 29.48 (2.80) | 27.81 (2.78) |
| Education (years) | 13.65 (2.37) | 13.79 (2.30) |
| Smoke (=1 if every day/some day smoker and 100 cigarettes in their lifetime) | 0.22 (0.42) | 0.17 (0.38) |
| Very healthy (=1 if excellent or very good self-reported health) | 0.83 (0.38) | 0.80 (0.40) |
| Number of children under age 6 (per couple) | 0.83 (0.85) | |
| Number of couples | 10,305 | |

| B. Never married | Men | Women |
|---|---|---|
| Age (years) | 28.24 (3.12) | 25.91 (3.06) |
| Education (years) | 13.77 (2.28) | 13.99 (2.22) |
| Smoke (=1 if every day/some day smoker and 100 cigarettes in their lifetime) | 0.29 (0.45) | 0.26 (0.44) |
| Very healthy (=1 if excellent or very good self-reported health) | 0.79 (0.41) | 0.77 (0.42) |
| Number of children under age 6 | 0.04 (0.24) | 0.15 (0.44) |
| Number of individuals | 8,990 | 9,361 |

Notes: CPS 1996–2003, men aged 24–34, women aged 22–32. Sampling weights are used.

4. Empirical Analysis

4.1. From Theory to Data

It is well known by now that matching models are difficult to test from the sole observation of matching patterns.
For instance, the Choo–Siow formulation, where the matching problem is analyzed as a series of discrete choice models, is exactly identified and therefore cannot be rejected by the data, even with its strong parametric assumption, that is, that unobserved heterogeneity follows a type 1 extreme value distribution. Here, we suggest that an alternative approach may be feasible. In more complex settings, including multidimensional contexts, matching theory can generate strong qualitative predictions. In some cases—including the present one—hypotheses based on these predictions can be tested in reduced form. Specifically, the theoretical analysis suggests that matching patterns should exhibit specific features due to the underlying competitive structure. These features could in principle be formally derived from an explicit, specific stochastic structure, which can be based on a Choo–Siow setting (following a line of research initiated by Graham 2011), but not in general; an alternative justification would involve a “rank order” property à la Fox (2010), or possibly search models. In any case, we argue that if the Assumption MP is satisfied, then we expect the following regularities to hold. Hypothesis 1 (Relative Prevalence of Mixed Couples). Mixed couples in which the wife smokes (denoted (0,1)) should be less frequent than those in which the husband smokes (denoted (1,0)). Actually, the Assumption MP by itself implies an asymmetry between the proportions of mixed couples: if there are more male than female smokers, then one expects more (1,0) couples. To take an easy benchmark, assume that the proportion of smokers in the male (resp. female) population is equal to sM (resp. 
sW) for all SES levels; then if matching were random with respect to smoking habits, the ratio of (0,1) to (1,0) couples would be $r_{\mathit{random}}$, and Hypothesis 1 states that the observed ratio falls below it:
\begin{equation*} r_{\mathit {observed}}<r_{\mathit {random}}=\frac{s_{W}\left( 1-s_{M}\right) }{s_{M}\left( 1-s_{W}\right) } \end{equation*}
In our total sample, sM is 0.22 and sW is 0.17, so the ratio r implied by the sole difference in relative smoking prevalence is around 0.72; we expect the observed ratio to be significantly smaller than this threshold. Moreover, the same exercise can be performed for each “marriage market” (or region).

Hypothesis 2 (Smoking “Premium” for Smoking Wives). Among smoking husbands, those who marry smoking wives should have (on average) a higher SES than those who marry nonsmokers.

Hypothesis 3 (Smoking “Penalty” for Smoking Husbands). Among nonsmoking wives, those who marry smoking husbands should have (on average) a lower SES than those who marry nonsmokers.

Hypothesis 4 (Assortativeness by SES among couples with identical smoking habits). Among couples with identical smoking habits (i.e., both smokers or both nonsmokers), matching should be assortative on SES.

Hypothesis 5 (Stronger Assortativeness by Smoking Status at the Top of the SES Distribution). Smoking husbands (wives) at the top of the SES distribution are more likely to marry smoking wives (husbands) than those with lower SES. Similarly, nonsmoking husbands (wives) at the top of the SES distribution are more likely to marry nonsmoking wives (husbands) than those with lower SES.

Hypothesis 6 (Conditional versus Unconditional Correlation between Smoking and SES). When two nonsmoking women with the same SES marry, respectively, a smoker and a nonsmoker, the nonsmoker should be on average of lower SES than the smoker. That is, controlling for the (nonsmoking) wife’s SES, the smoking habit of the husband should be positively correlated with his SES.
This result seems counterintuitive, since a well-established empirical fact is that the (unconditional) correlation between smoking and education is negative: low SES is an important predictor of smoking behavior (e.g., CDC 2010). In the theoretical framework we use, however, (perfect) assortative matching on SES implies that, for given smoking habits, the husband’s SES is fully determined by his wife’s. In reality, agents match on several characteristics (many of them unobservable), so that the wife’s SES is not a perfect control for the husband’s, meaning that the (unconditional) negative correlation between smoking and SES is likely to persist. We therefore restate the prediction as follows: the conditional correlation between male SES and smoking status, given the SES of nonsmoking wives, should be less negative than the unconditional one. Note that this pattern should not hold true for women.

The remaining part of the paper is devoted to testing these hypotheses.

4.2. Testing the Hypotheses

4.2.1. Relative Prevalence of Mixed Couples

Table 2 reports both the observed (Panel A) and random (Panel B) matching patterns by smoking status for husbands and wives. Panel A reveals that there is strong assortative mating by smoking status: in about 71% of couples both spouses are nonsmokers, and in roughly 10% both are smokers. This is in line with previous evidence on marital sorting by smoking status (e.g., Clark and Etilé 2006). In addition, there are fewer mixed couples where the wife smokes than vice versa, 6.7% versus 11.93%. These figures are very different from the percentages arising from random matching that are reported in Panel B, where sorting is weaker (the percentage of couples in the main diagonal is now less than 70%) and the two types of mixed couples are much more prevalent and similarly represented. Indeed, Panel C shows that the observed ratio of mixed couples where the wife smokes to those where the husband smokes is 0.56 (s.e. 
= 0.03), which is statistically significantly lower than the 0.72 (s.e. = 0.02) implied by the sole difference in relative smoking prevalence, consistent with our Hypothesis 1 (see footnote 12). As a last test, in Panel D, we perform the same exercise at the US Geographical Census division level (which consists of nine regions). The results are clear-cut: we find that the prediction is satisfied in each of these regions. Under the null of random matching, the probability of such an outcome would be extremely small, that is, $2^{-9}\approx 2\times 10^{-3}$.

Table 2. Matching patterns by smoking status.

| A. Observed | Nonsmoking wife | Smoking wife |
|---|---|---|
| Nonsmoking husband | 70.93% (7,390) | 6.70% (683) |
| Smoking husband | 11.93% (1,211) | 10.44% (1,021) |

| B. Random | Nonsmoking wife | Smoking wife |
|---|---|---|
| Nonsmoking husband | 64.32% | 13.30% |
| Smoking husband | 18.53% | 3.83% |

| C. Prevalence ratios of mixed couples | Observed | Random |
|---|---|---|
| Ratios | $\frac{6.70}{11.93}=0.56$ [0.03] | $\frac{13.30}{18.53}=0.72$ [0.02] |

| D. Prevalence ratios of mixed couples by Census Division | Observed | Random |
|---|---|---|
| Census Division 1 (New England) | 0.57 [0.12] | 0.72 [0.08] |
| Census Division 2 (Middle Atlantic) | 0.63 [0.09] | 0.73 [0.07] |
| Census Division 3 (East North Central) | 0.67 [0.09] | 0.80 [0.06] |
| Census Division 4 (West North Central) | 0.62 [0.11] | 0.75 [0.08] |
| Census Division 5 (South Atlantic) | 0.43 [0.07] | 0.63 [0.05] |
| Census Division 6 (East South Central) | 0.58 [0.12] | 0.75 [0.08] |
| Census Division 7 (West South Central) | 0.48 [0.08] | 0.67 [0.06] |
| Census Division 8 (Mountain) | 0.55 [0.09] | 0.70 [0.07] |
| Census Division 9 (Pacific) | 0.56 [0.10] | 0.69 [0.08] |

Note: Sampling weights are used. Weighted % and (nonweighted number of observations). “Delta method” standard errors in brackets.

4.2.2. Smoking Premium and Smoking Penalty

We investigate Hypotheses 2 and 3 in Tables 3 and 4, respectively. They contain a series of regressions in which either the husband’s or wife’s education is the dependent variable whereas spouse’s education and smoking status are the main explanatory variables, for the samples of smoking husbands or smoking wives in Table 3, and of nonsmoking husbands or nonsmoking wives in Table 4. Two specifications are presented: a standard one, with controls for own age, year and state fixed effects; and another one with additional controls, where we also include a spousal very healthy indicator, number of children under age six and a male-head indicator.

Table 3. Regressions of education on spousal smoking status for smokers.

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Sample | Smoking wives | Smoking wives | Smoking husbands | Smoking husbands |
| Dependent variable | Wife’s education | Wife’s education | Husband’s education | Husband’s education |
| Spouse is a smoker | −0.002 (0.092) | −0.006 (0.092) | 0.139* (0.080) | 0.151* (0.080) |
| Spouse’s education | 0.471*** (0.029) | 0.461*** (0.030) | 0.555*** (0.028) | 0.542*** (0.028) |
| Standard controls | Yes | Yes | Yes | Yes |
| Additional controls | No | Yes | No | Yes |
| N | 1,704 | 1,704 | 2,232 | 2,232 |
| R2 | 0.28 | 0.29 | 0.37 | 0.37 |

Note: Standard controls: own age, year, and state fixed effects. Additional controls: spousal very healthy indicator, number of children under age 6, and a male-head indicator. Sampling weights are used. Robust standard errors in parentheses. ***p-value < 0.01; *p-value < 0.1.

Table 4. Regressions of education on spousal smoking status for nonsmokers.

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Sample | Nonsmoking wives | Nonsmoking wives | Nonsmoking husbands | Nonsmoking husbands |
| Dependent variable | Wife’s education | Wife’s education | Husband’s education | Husband’s education |
| Spouse is a smoker | −0.133** (0.063) | −0.108* (0.063) | −0.235*** (0.080) | −0.208*** (0.080) |
| Spouse’s education | 0.636*** (0.013) | 0.608*** (0.013) | 0.684*** (0.014) | 0.669*** (0.015) |
| Standard controls | Yes | Yes | Yes | Yes |
| Additional controls | No | Yes | No | Yes |
| N | 8,601 | 8,601 | 8,073 | 8,073 |
| R2 | 0.48 | 0.49 | 0.46 | 0.47 |

Note: Standard controls: own age, year, and state fixed effects. Additional controls: spousal very healthy indicator, number of children under age 6 and a male-head indicator. Sampling weights are used. Robust standard errors in parentheses. ***p-value < 0.01; **p-value < 0.05; *p-value < 0.1.

Starting from the left block of regressions in Table 3, columns (1) and (2) show that among smoking wives there is no statistically significant difference in the average years of completed education between those who marry smoking men and those who marry nonsmoking ones, and that the point estimates are virtually zero. The action is concentrated in columns (3) and (4): among smoking husbands, those who marry smoking women have on average 0.15 more years of completed education (or about 1.2% higher annual earnings13) than those with nonsmoking wives. This supports our Hypothesis 2.
Our theoretical analysis shows that, given the shortage of smoking women, smoking men who marry smoking women should be more educated, whereas no such effect should be observed for women, which is what we observe in the “placebo” columns (1) and (2). Smoking men have to compete for smoking partners with their own education. In other words, smoking women “marry up” with respect to their own education; they “benefit” from the fact that they are on the short side of the market by marrying a higher “quality” spouse. Note that the sign of the wife’s smoker coefficient in columns (3) and (4) is opposite to the well-known negative gradient between own smoking and own education. The evidence presented in Table 3 is also supportive of our assumptions on the surplus reduction due to smoking. If, for instance, smokers also preferred nonsmokers then we would not observe the positive coefficient of columns (3) and (4). By the same token, our empirical evidence also shows that a gender asymmetric perception of smoking cannot be the driving force behind the observed matching patterns. If men, regardless of their smoking status, perceived smoking in a woman as a defect, we would not observe the positive coefficient of columns (3) and (4) either. We now turn to investigate Hypothesis 3. Table 4 inquires about the average education among nonsmoking wives and nonsmoking husbands depending on the smoking status of their spouses. Consistent with Hypothesis 3, columns (1) and (2) show that among nonsmoking wives those with smoking husbands have on average 0.13–0.11 fewer years of completed education (or about a 0.9% lower annual earnings) than those with nonsmoking husbands. In other words, a smoking husband marries, on average, a worse nonsmoking spouse in terms of education than if he were to be a nonsmoker: smoking husbands, who are on the long side of the market, “marry down”, that is, they are penalized for their “handicap” with a lower “quality” spouse. 
This suggests that spousal smoking is a bad characteristic for nonsmokers, and that there is a marriage market “penalty” associated with it, in terms of lower socioeconomic standards. Finally, columns (3) and (4) show that among nonsmoking husbands those with nonsmoking wives have on average 0.24–0.21 more years of completed education than those with smoking wives. Thus, a nonsmoking wife marries, on average, a better nonsmoker spouse in terms of education.

4.2.3. Smoking Premium and Smoking Penalty Controlling for the Propensity Score

Overall, the estimates reported in Tables 3 and 4 are consistent with our model, and support Hypotheses 2 and 3. A potential concern is that smoking status captures some characteristics that we do not observe in our data (e.g., personality traits) that would explain the above empirical findings. Although smoking is correlated with self-control and impatience, recent empirical evidence shows that some noncognitive skills and personality traits exhibit negative sorting in the marriage market, opposite to the strong positive sorting observed for smoking status. Dupuy and Galichon (2014) and Lundberg (2012) find that there is negative sorting in autonomy, that male extraversion is negatively correlated with female conscientiousness and risk attitudes, and male agreeableness is negatively correlated with female extraversion.14

Another concern is whether such empirical patterns (even when controlling for additional covariates) are the mere reflection of smoking behavior being endogenous to marital status, rather than a confirmation of the postulated marriage market mechanism. We explore such a possibility by using a simple and popular estimator in program evaluation (see Wooldridge 2002) obtained from an OLS regression of education on spousal smoking status controlling for the predicted probability that the spouse is a smoker (the propensity score).
The spousal propensity score is the probability that the spouse is a smoker given a set of observable characteristics. To make this approach as appealing as possible, at least one of the predictors must be predetermined with respect to marital status, and the age at which the spouse started smoking regularly seems a natural candidate. Hence, we predict the probability that the spouse j is a smoker ($$\mathit{Smoking}_{j}=1$$) given a set of spousal characteristics (Xj) using a probability model (Probit),

$$\mathit{Smoking}_{j}=\Phi (\boldsymbol {X}_{j}\boldsymbol {\beta })+\varepsilon _{j},$$ (9)

where Φ is the standard normal cdf and Xj contains the following spousal characteristics: age (in categories), years of education, year fixed effects, state fixed effects, and age at which the spouse started smoking regularly. We code this last variable as 0 for never smokers or never regular smokers. For this reason, and because people who start regularly using tobacco when they are younger are more likely to have trouble quitting than people who start later in life (Surgeon General’s Report 2012), we expect a nonlinear relationship between smoking status today and age at which the individual started smoking regularly, so that we also consider the square of this variable as an additional predictor.
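To make the coding of the predetermined predictor concrete, the following is a minimal sketch of how the Probit index behind equation (9) could be evaluated for a single spouse. The slope values only mimic the signs and rough magnitudes reported in Table A.1; the intercept, the function and key names, and the omission of the age, year, and state fixed effects are assumptions made for illustration, not the estimation actually carried out.

```python
import math


def normal_cdf(z):
    """Standard normal cdf, the Phi of equation (9)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical coefficients: the slopes imitate the signs and rough sizes
# in Table A.1, the intercept is invented, and fixed effects are left out.
ILLUSTRATIVE_BETA = {
    "const": -2.0,
    "start_age": 0.275,
    "start_age_sq": -0.006,
    "education": -0.108,
}


def spousal_propensity(start_age, education, beta=ILLUSTRATIVE_BETA):
    """Predicted probability that a spouse is a smoker.

    start_age is None for never (regular) smokers and is then coded as 0,
    following the coding rule described in the text.
    """
    a = 0.0 if start_age is None else float(start_age)
    index = (
        beta["const"]
        + beta["start_age"] * a
        + beta["start_age_sq"] * a * a  # squared term allows a nonlinear profile
        + beta["education"] * education
    )
    return normal_cdf(index)


# Two spouses with 12 years of education: one never smoked regularly,
# the other started smoking regularly at age 16.
never_smoker = spousal_propensity(None, 12)
starter_at_16 = spousal_propensity(16, 12)
print(never_smoker < starter_at_16)
```

Because never smokers are coded as 0, the linear and squared terms both vanish for them, so in this sketch their predicted propensity is driven only by the remaining covariates.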
Once we have estimated the propensity scores $$\widehat{\mathit{Smoking}}_{j}$$ for men and women (see Table A.1), we then estimate the following OLS regressions

$${\mathit{Education}}_{i}=\alpha +\delta {\mathit{Smoking}}_{j}+\gamma \widehat{\mathit {Smoking}}_{j}+u_{i}$$ (10)

for four groups of individuals, smoking wives, smoking husbands, nonsmoking wives, and nonsmoking husbands, but also

$$\mathit{Education}_{i}=\alpha +\delta \mathit{Smoking}_{j}+\gamma _{1}\widehat{ \mathit{Smoking}}_{j} +\gamma _{2}\widehat{ \mathit{Smoking}}_{j}^{2}+u_{i}$$ (11)

to account for potential nonlinearities of the conditional expectation function of education in the propensity score. The estimated propensity score plays the role of a control function.

Reassuringly, the estimates displayed in Tables 5 and 6 give a similar picture to our main estimates in Tables 3 and 4, and are indeed consistent with existing research. Using information on engaged couples, newlyweds, and couples married for over 5 years, Sutton (1993) finds that similarities in smoking status were already present about the time of marriage. More recently, Banks et al. (2013), who use retrospective evidence from the Health and Retirement Study, show that most smoking behavior is initiated before marriage, so that smoking is not endogenous to marital status. Controlling for own age in equations (10) and (11) is immaterial for our findings (results available upon request).

Table 5. Regressions of education on spousal smoking status for smokers.
Controlling for Spousal Propensity to Smoke $$\widehat{P}(\text{Spouse's characteristics})$$

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Sample | Smoking wives | Smoking wives | Smoking husbands | Smoking husbands |
| Dependent variable | Wife’s education | Wife’s education | Husband’s education | Husband’s education |
| Spouse is a smoker | −0.002 | 0.004 | 0.247* | 0.243* |
| | (0.150) | (0.164) | (0.130) | (0.138) |
| $$\widehat{P}(\text{Spouse's characteristics})$$ | −0.581** | 2.77*** | −0.734*** | 3.17*** |
| | (0.243) | (0.797) | (0.207) | (0.707) |
| $$\widehat{P}(\text{Spouse's characteristics})^{2}$$ | – | −4.86*** | – | −5.87*** |
| | | (0.962) | | (0.897) |
| N | 1,656 | 1,656 | 2,204 | 2,204 |
| R2 | 0.01 | 0.03 | 0.01 | 0.03 |

Note: $$\widehat{P}(\text{Spouse's characteristics})$$ is the predicted probability of a spouse smoking status using a Probit. Spouse’s characteristics: age (in categories), years of education, age when he/she started smoking, and its square (in columns (2) and (4)), year, and state fixed effects. Sampling weights are used. Robust standard errors in parentheses. ***p-value < 0.01; **p-value < 0.05; *p-value < 0.1.

Table 6. Regressions of education on spousal smoking status for nonsmokers.
Controlling for Spousal Propensity to Smoke $$\widehat{P}(\text{Spouse's characteristics})$$

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Sample | Nonsmoking wives | Nonsmoking wives | Nonsmoking husbands | Nonsmoking husbands |
| Dependent variable | Wife’s education | Wife’s education | Husband’s education | Husband’s education |
| Spouse is a smoker | −0.228** | −0.237** | −0.462*** | −0.446*** |
| | (0.101) | (0.096) | (0.112) | (0.111) |
| $$\widehat{P}(\text{Spouse's characteristics})$$ | −1.42*** | 2.46*** | −1.00*** | 0.709 |
| | (0.132) | (0.565) | (0.145) | (0.539) |
| $$\widehat{P}(\text{Spouse's characteristics})^{2}$$ | – | −5.67*** | – | −2.70*** |
| | | (0.851) | | (0.835) |
| N | 8,529 | 8,529 | 8,038 | 8,038 |
| R2 | 0.04 | 0.04 | 0.02 | 0.02 |

Note: $$\widehat{P}(\text{Spouse's characteristics})$$ is the predicted probability of a spouse smoking status using a Probit. Spouse’s characteristics: age (in categories), years of education, age when he/she started smoking, and its square (in columns (2) and (4)), year, and state fixed effects. Sampling weights are used. Robust standard errors in parentheses. ***p-value < 0.01; **p-value < 0.05.

4.2.4. Assortativeness by SES among Couples with Identical Smoking Habits

In Table 7 we investigate the degree of assortativeness by education among different types of couples depending on spouses’ smoking status. For each type of couple, we regress own education on spouse’s education controlling for own age, year, and state fixed effects. Our estimates reveal positive assortative matching by education for each type of couple. Although assortative mating by education has been extensively documented in the literature (Lam 1988; Mare 2008; Pencavel 1998; Qian 1998), here we show that it holds true within each spouses’ smoking category. This is consistent with Hypothesis 4.

Table 7. Assortative matching by education.

Regressions of education on spousal education by type of couple.

| Couple type | Both nonsmokers | Both nonsmokers | Both smokers | Both smokers |
|---|---|---|---|---|
| Dependent variable | Wife’s education | Husband’s education | Wife’s education | Husband’s education |
| Spouse’s education | 0.641*** | 0.695*** | 0.464*** | 0.440*** |
| | (0.014) | (0.015) | (0.037) | (0.035) |
| N | 7,390 | 7,390 | 1,021 | 1,021 |
| R2 | 0.47 | 0.47 | 0.27 | 0.28 |

| Couple type | Smoking husband, nonsmoking wife | Smoking husband, nonsmoking wife | Nonsmoking husband, smoking wife | Nonsmoking husband, smoking wife |
|---|---|---|---|---|
| Dependent variable | Wife’s education | Husband’s education | Wife’s education | Husband’s education |
| Spouse’s education | 0.604*** | 0.619*** | 0.482*** | 0.488*** |
| | (0.034) | (0.040) | (0.047) | (0.044) |
| N | 1,211 | 1,211 | 683 | 683 |
| R2 | 0.44 | 0.44 | 0.33 | 0.31 |

Notes: All regressions include own age, year, and state fixed effects. Sampling weights are used. Robust standard errors in parentheses. ***p-value < 0.01.

4.2.5. Stronger Assortativeness by Smoking Status at the Top of the SES Distribution

In Table 8 we investigate how the degree of assortativeness by smoking status varies across the SES distribution. We run regressions of a spousal smoking status indicator on individual smoking status, individual SES, and the interaction of these two variables. We measure SES in two different ways: years of schooling and high-school degree or above (12 years of schooling or more). The results of these regressions suggest that, if anything, “mixed” couples (i.e., where one spouse is a smoker and the other is not) are less prevalent at high SES, supporting Hypothesis 5.

Table 8. Assortativeness in smoking status by SES.
Regressions of spousal smoking status

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Dependent variable | Wife is a smoker | Wife is a smoker | Husband is a smoker | Husband is a smoker |
| Education | −0.014*** | – | −0.022*** | – |
| | (0.001) | | (0.002) | |
| Smoker | 0.239*** | 0.311*** | 0.389*** | 0.375*** |
| | (0.074) | (0.032) | (0.103) | (0.045) |
| Education × smoker | 0.009 | – | 0.004 | – |
| | (0.006) | | (0.008) | |
| High school (or more) | – | −0.034** | – | −0.105*** |
| | | (0.016) | | (0.020) |
| High school (or more) × smoker | – | 0.073** | – | 0.092* |
| | | (0.035) | | (0.047) |
| N | 10,305 | 10,305 | 10,305 | 10,305 |
| R2 | 0.20 | 0.19 | 0.20 | 0.19 |

Note: All regressions include spouse’s age, year, and state fixed effects. Sampling weights are used. Robust standard errors in parentheses. ***p-value < 0.01; **p-value < 0.05; *p-value < 0.1.

4.2.6. Conditional versus Unconditional Correlations between Smoking and SES

Finally, in Table 9 we run regressions of education on smoking status for two samples, nonsmoking women and smoking men, controlling or not for spousal SES. Our findings support Hypothesis 6. Among men married to nonsmoking women, smoking men tend to have (on average) a lower SES (1.3 fewer years of education) but this difference decreases (to 0.7 years less of education) when controlling for their wives’ SES. Among women married to smoking men, smoking women tend to have (on average) a lower SES (around 0.3 fewer years of education) regardless of their husbands’ SES.

Table 9. Conditional and unconditional correlations.

Regressions of education on smoking status

| | Unconditional | Conditional | Unconditional | Conditional |
|---|---|---|---|---|
| Sample | Nonsmoking women | Nonsmoking women | Smoking men | Smoking men |
| Dependent variable | Husband’s education | Husband’s education | Wife’s education | Wife’s education |
| Smoker | −1.34*** | −0.657*** | −0.339*** | −0.314*** |
| | (0.076) | (0.063) | (0.094) | (0.080) |
| Spouse’s education | – | 0.683*** | – | 0.559*** |
| | | (0.014) | | (0.025) |
| Adjusted Wald test of equality (p-value) | 0.0000 | | 0.6291 | |
| N | 8,601 | 8,601 | 2,232 | 2,232 |

Note: All regressions include own age, year and state fixed effects. Sampling weights are used. Robust standard errors in parentheses. *** p-value < 0.01.

Table A.1. Propensity to smoke.
Probit models of smoking status on explanatory variables

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Sample | Wives | Wives | Husbands | Husbands |
| Age individual started smoking regularly | 0.146*** | 0.275*** | 0.156*** | 0.290*** |
| | (0.003) | (0.011) | (0.003) | (0.013) |
| (Age individual started smoking regularly)² | – | −0.006*** | – | −0.007*** |
| | | (0.0005) | | (0.0006) |
| Education | −0.126*** | −0.108*** | −0.138*** | −0.127*** |
| | (0.010) | (0.011) | (0.010) | (0.012) |
| Age fixed effects | Yes | Yes | Yes | Yes |
| Year fixed effects | Yes | Yes | Yes | Yes |
| State fixed effects | Yes | Yes | Yes | Yes |
| N | 10,185 | 10,185 | 10,242 | 10,242 |
| Pseudo-R2 | 0.51 | 0.53 | 0.54 | 0.55 |

Note: Sampling weights are used. Robust standard errors in parentheses. ***p-value < 0.01.

4.3. Refutability Tests

We implement two types of refutability tests (see Angrist and Krueger 2001). We first provide placebo tests based on the idea that the predictions of our model should not be borne out in the data when married men and women are randomly assigned to couples. In our second approach, we test the proportionality constraints implied by an alternative single-index model.

4.3.1. Placebo Tests

We reshuffle our observed married individuals into new randomly married couples. The way we do it is in two stages.
We first keep married men in our sample and generate a new id drawn from a uniform distribution for each of these men, and then rank them according to that pseudorandom id. We do the same for married women: we generate a new id drawn from a uniform distribution for each of these women, and then rank them according to that pseudorandom id. We then merge men and women according to that pseudorandom id. This constitutes our random matching sample. After that, we investigate whether the hypotheses based on the predictions of our model are observed in these randomly generated data. The answer is clearly negative. Table G.1 in the Supplementary Material shows that with randomly generated couples, it is not true that among smoking husbands those who marry smoking wives have (on average) a higher SES. Table G.2 in the Supplementary Material shows that with randomly generated couples, it is not true that among nonsmoking wives, those who marry smoking husbands have (on average) a lower SES. Table G.3 in the Supplementary Material shows that with randomly generated couples, it is not true that “mixed” couples are less prevalent at high levels of SES. Finally, Table G.4 in the Supplementary Material reveals that with randomly generated couples, Hypothesis 6 is not borne out in the data. All in all, it seems that our model cannot explain the marriage patterns of randomly generated couples. 4.3.2. Alternative Single-Index Model Our bidimensional model of smoking and SES rules out a single index representation: the trade-off between these two characteristics is perceived differently among potential spouses, which is precisely what index models forbid. Chiappori et al. (2012) have shown that index models are testable; they must satisfy a set of proportionality restrictions. 
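One way to read the proportionality restrictions, sketched here under the simplifying assumption of a linear index (the symbols $$I$$, $$\theta$$, $$a_{k}$$, $$b_{k}$$, and $$c_{k}$$ are introduced for illustration only and are not taken from the original analysis): if wives evaluate a husband's education $$e$$ and smoking status $$s$$ only through a common index, then the relative weight of smoking must be the same whichever wife characteristic $$w^{k}$$ is used as the dependent variable.

```latex
\begin{align*}
I &= e + \theta s, \\
\mathbb{E}\left[ w^{k} \mid e, s \right] &= a_{k} + b_{k} e + c_{k} s
  \quad (k = 1, 2), \\
\text{single index} \;&\Longrightarrow\; \frac{c_{1}}{b_{1}} = \frac{c_{2}}{b_{2}} = \theta .
\end{align*}
```

Under heterogeneous preferences the trade-off between education and smoking differs across potential spouses, so there is no reason for the two coefficient ratios to coincide.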
In Table G.5 in the Supplementary Material, we show that these predictions are rejected by the data when the two dimensions are smoking status and SES, providing additional support for the contention that, albeit very appealing, the single-index model representation is not always appropriate, and that, in particular, it does not apply to our case.

5. Discussion

Our model parsimoniously assumes that the presence of a smoking spouse decreases surplus by the same factor irrespective of the partner’s smoking status. This assumption could be relaxed; one could assume, for instance, that the surplus discount is larger for mixed couples. The corresponding, more general model would still be solvable using the same technique. Most general conclusions would remain valid. In particular, there would be assortative matching at the top of the distribution; if male smokers outnumbered female ones, then mixed couples would consist of a male smoker and a nonsmoking wife; and in the latter case the wife would “marry up”. However, other predictions would be lost. For instance, the last statement of Proposition 5 would no longer hold; when a smoking man randomizes between two potential spouses, they will not have the same socioeconomic index (although, interestingly, they do in the data, see Table 6). In addition, the closed-form solution of the uniform-quadratic example provided in Supplementary Material would no longer be valid; its resolution, although still feasible, would become more complex (and tedious). For instance, it would require considering more cases. All in all, the new, and less tractable, model, would lose in empirical content (although the predictions that would be discarded are supported by the data), with no significant gain in terms of its basic insights. We therefore believe that parsimony considerations suggest concentrating on the single λ case.

6. Conclusions

We develop a bidimensional frictionless matching model on the marriage market under transferable utility, where individuals are characterized by a continuous trait (e.g., socioeconomic status) and a binary attribute (e.g., smoking status) that is heterogeneously (dis)liked in the population. As long as traits are not perfectly correlated, this heterogeneity is key: it rules out a single-index representation. The trade-off between the two characteristics is perceived differently among potential spouses, which is precisely what index models forbid. That is, the corresponding one-dimensional matching techniques are not well-suited to characterize the stable matches in this setting, and this paper precisely characterizes a specific extension to allow for heterogeneous preferences.

The main message of our paper is twofold. First, specific multidimensional models of matching, although intrinsically more complex than one-dimensional ones, are by no means intractable. We actually describe a general strategy for tackling problems of this type, and in particular how the characterization of the equilibrium can be formulated as an optimal control problem; we also show how such a theoretical approach can generate strong predictions on matching patterns. Although we concentrate on smoking in our empirical application, other aspects could readily be considered. Race or ethnicity are important examples (e.g., Chiappori, Oreffice, and Quintana-Domeque 2016), or characteristics such as age or health attributes.

Secondly, although one-dimensional models are difficult to test, at least in the absence of additional information on either match-specific surpluses or transfers between agents, in more complex settings, and particularly in multidimensional contexts, the strong qualitative predictions that matching theory generates can be taken to the data. Hypotheses based on these predictions can be tested in reduced form.
In our case, the model predicts a series of specific patterns (from the relative scarcity of couples where the wife smokes and the husband does not, to the opposite association of a smoking spouse on one’s education for wives and husbands) that are difficult to justify otherwise, and can be directly tested in a simple and robust way. We do not view such a strategy as a substitute for more explicitly structural approaches, but we believe it provides an informative complement.

Notes

The editor in charge of this paper was Juuso Välimäki.

Acknowledgements

We thank Juuso Välimäki (the editor) and two anonymous reviewers for helpful comments and suggestions. We also thank the participants at the Family Economics Workshop RHUL 2014, “The Econometrics of Matching” invited session of the EEA 2013, SaM Mainz 2013, Paris Matching Workshop 2013, Royal Economic Society Meetings 2012, Barcelona MOVE Family Conference 2011, and seminars at Ecole Polytechnique, CEPS/INSTEAD, Lancaster University, LSE, University of Oxford, University of Chicago, Columbia University, University of Aarhus, Universitat d’Alacant, Universitat Pompeu Fabra and Universidad Carlos III de Madrid. Chiappori acknowledges financial support from the NSF (award # SES-1124277). Oreffice and Quintana-Domeque acknowledge financial support from the Spanish Ministry of Science and Innovation (ECO 2011-29751). The usual disclaimers apply.

Appendix A: Proof of Proposition 3

Take some small ε > 0 such that   \begin{equation*} S\left( 1-\varepsilon ,1-\varepsilon \right) >\lambda S\left( 1,1\right). \end{equation*} Define η(ε) > 0 by

$$\int _{1-\varepsilon }^{1}dF_{N}\left( s\right) =\int _{1-\eta \left( \varepsilon \right) }^{1}dG_{N}\left( s\right) ,$$ (A.1)

so that there are exactly as many nonsmoking men with SES above 1 − η(ε) as nonsmoking women with SES above 1 − ε.
We claim that almost all female nonsmokers with SES at least 1 − ε are married with a male nonsmoker with SES at least 1 − η(ε) (note that (A.1) then implies that, conversely, almost all male nonsmokers with SES at least 1 − η(ε) are married with a female nonsmoker with SES at least 1 − ε). Assume not, then there exists a positive measure set O of female nonsmokers with SES at least 1 − ε married with a smoker. By (A.1), there must exist a set O΄ of identical measure gathering male nonsmokers with SES at least 1 − η(ε), who are not married with female nonsmokers with SES at least 1 − ε. Then either almost all males in O΄ are married with nonsmokers with SES less than 1 − ε, or a non-null subset of males in O΄ is matched with smokers. We start with the second case. Let x ∈ O, y her (smoking) match, and y΄ ∈ O΄ matched with a smoker x΄. Surplus is   \begin{equation*} \Sigma =\lambda S\left( x,y\right) +\lambda S\left( x^{\prime },y^{\prime }\right), \end{equation*} whereas matching x and y΄ would generate a surplus   \begin{equation*} \Sigma _{1}=S\left( x,y^{\prime }\right) +\lambda S\left( x^{\prime },y\right). \end{equation*} By definition of ε, Σ1 > Σ, a contradiction. Assume now that almost all males in O΄ are married with nonsmokers with SES less than 1 − ε. Let x ∈ O be matched with yS, smoker, whereas y΄ ∈ O΄ is matched with a nonsmoking wife x΄ < 1 − ε. The surplus generated is thus   \begin{equation*} \Sigma =S\left( x^{\prime },y^{\prime }\right) +\lambda S\left( x,y_{S} \right), \end{equation*} whereas mixing matches would generate   \begin{equation*} \Sigma _{1}=S\left( x,y^{\prime }\right) +\lambda S\left( x^{\prime } ,y_{S}\right).
\end{equation*} Note that yS > 1 − η(ε), for otherwise   \begin{eqnarray*} \Sigma _{1}-\Sigma & =S\left( x,y^{\prime }\right) -S\left( x^{\prime },y^{\prime }\right) -\lambda \left( S\left( x,y_{S}\right) -S\left( x^{\prime },y_{S}\right) \right) \\ & >S\left( x,y^{\prime }\right) -S\left( x^{\prime },y^{\prime }\right) -\left( S\left( x,y_{S}\right) -S\left( x^{\prime },y_{S}\right) \right) >0 \end{eqnarray*} by supermodularity, which contradicts surplus maximization. Define   \begin{equation*} \phi \left( s\right) =S\left( x,s\right) -S\left( x^{\prime },s\right) \end{equation*} then ϕ is differentiable and strictly positive on [0, 1]. We have that   \begin{equation*} \left|\phi \left( y^{\prime }\right) -\phi \left( y_{S}\right) \right|\le \left|y^{\prime }-y_{S}\right|M, \end{equation*} where $$M=\sup _{\left[ 0,1\right] }\left|\phi ^{\prime }\right|$$, and where |y΄ − yS| ≤ η(ε). It follows that   \begin{equation*} \phi \left( y_{S}\right) \le \phi \left( y^{\prime }\right) +\eta \left( \varepsilon \right) M, \end{equation*} therefore   \begin{equation*} \Sigma _{1}-\Sigma =\phi \left( y^{\prime }\right) -\lambda \phi \left( y_{S}\right) \ge \left( 1-\lambda \right) \phi \left( y^{\prime }\right) -\lambda \eta \left( \varepsilon \right) M, \end{equation*} which is positive for ε small enough, a contradiction. Appendix B: Proof of Proposition 4 The proof relies on the following Lemma. We can now prove the proposition. By Lemma 1, y΄ > y and x = x΄. Assume that qN(y) > 0, that is, that y marries a smoker x″ with positive probability. Then x″ > x = x΄ by Lemma 1. 
But the couples (x″, y) and (x΄, y΄) generate a surplus   \begin{equation*} \Sigma =\lambda S\left( x^{\prime \prime },y\right) +\lambda S\left( x^{\prime },y^{\prime }\right), \end{equation*} whereas the mixed couples (x΄, y) and (x″, y΄) would generate a surplus   \begin{equation*} \Sigma _{1}=\lambda S\left( x^{\prime },y\right) +\lambda S\left( x^{\prime \prime },y^{\prime }\right) \end{equation*} and Σ1 > Σ by supermodularity of S; an open set of marriages satisfying this pattern would violate surplus maximization. Similarly, assume that pS(x΄) < 1, that is, that x΄ marries a nonsmoker $$\bar{y}$$ with positive probability. Then $$\bar{y}=y^{\prime }>y$$ by Lemma 1. The couples (x, y) and $$\left( x^{\prime },\bar{y}\right)$$ generate a surplus   \begin{equation*} \Sigma =S\left( x,y\right) +\lambda S\left( x^{\prime },\bar{y}\right), \end{equation*} whereas the mixed couples $$\left( x,\bar{y}\right)$$ and (x΄, y) would generate a surplus   \begin{equation*} \Sigma _{1}=S\left( x,\bar{y}\right) +\lambda S\left( x^{\prime },y\right). \end{equation*} Since   \begin{equation*} S\left( x,\bar{y}\right) -S\left( x,y\right) =S\left( x^{\prime },\bar{y}\right) -S\left( x^{\prime },y\right) >\lambda \left( S\left( x^{\prime },\bar{y}\right) -S\left( x^{\prime },y\right) \right), \end{equation*} we have that Σ1 > Σ; again, an open set of marriages satisfying this pattern would violate surplus maximization. The proof of the last statement is identical. Appendix C: Proof of Proposition 5 The first statement is a direct consequence of Proposition 3. Similarly, the last two statements directly follow from Lemma 1 above. To establish the second, we need to show that any mixed marriage can only involve a nonsmoking woman and a male smoker. Under the Male Prevalence assumption, this directly follows from the proof of Proposition 4. Appendix D: Recovering Individual Utilities in The Stable Match D.1. The Symmetric Case Assume Ms. x (a nonsmoker) marries Mr. 
y (also a nonsmoker) at the stable match; note that x = y by the previous Proposition. Let uN(x) (resp. vN(y)) denote her (his) utility. Then   \begin{equation*} u_{N}\left( x\right) +v_{N}\left( x\right) =S\left( x,x\right). \end{equation*} By symmetry,   \begin{eqnarray*} u_{N}\left( x\right) & =&\frac{S\left( x,x\right) }{2},v_{N}\left( y\right) =\frac{S\left( y,y\right) }{2} \quad {\rm and\,\, similarly}\\ u_{S}\left( x\right) & =&\lambda \frac{S\left( x,x\right) }{2} ,v_{S}\left( y\right) =\lambda \frac{S\left( y,y\right) }{2}. \end{eqnarray*}

D.2. The General Case

Consider for instance a nonsmoker wife with SES x. If her husband is a nonsmoker with SES ϕN(x), then stability implies that   \begin{equation*} u_{N}\left( x\right) =\max _{y}\left( S\left( x,y\right) -v_{N}\left( y\right) \right), \end{equation*} the maximum being reached for y = ϕN(x). It follows, from the envelope theorem, that   \begin{equation*} u_{N}^{\prime }\left( x\right) =\frac{\partial S}{\partial x}, \end{equation*} where the right-hand side derivative is taken at the point (x, ϕN(x)). Then,   \begin{equation*} u_{N}\left( x\right) =\int _{0}^{x}\frac{\partial }{\partial x}S\left( t,\phi _{N}\left( t\right) \right) dt+K, \end{equation*} where K is a constant; and a similar expression obtains for the other utilities. The various utilities are therefore defined up to an additive constant each; the constants, in turn, are pinned down by the adding up property on the support of μ and the indifference conditions. In particular, it becomes possible to compute the difference uN(x) − uS(x), which can be interpreted as the cost of smoking on the marriage market (or equivalently as the gain that would result from quitting) for a woman with SES x.

D.3. The Constrained General Case

We now compute individual utilities at the stable matching, as a function of SES, gender and smoking habit. Specifically, Proposition D.1.
Under the MP assumption, individual utilities uN(x), uS(x), vN(y), vS(y) are all increasing. For all x and all y,   \begin{equation*} u_{N}\left( x\right) \ge u_{S}\left( x\right)\, {\quad\textit {and}}\quad\,v_{N}\left( y\right) \ge v_{S}\left( y\right). \end{equation*} Moreover, whenever pN(x) > 0, then   \begin{equation*} u_{N}\left( x\right) =u_{S}\left( x\right). \end{equation*} Lastly, for x and y large enough, the differences uN(x) − uS(x) and vN(y) − vS(y) are increasing (in x and y, respectively).

Proof. Assume Ms. x marries either Mr. y (a nonsmoker) or Mr. y΄. The first inequality comes from the fact that, for any given SES, a nonsmoker can only be either an equivalent or a better partner than a smoker. Next, if pN(x) > 0, both a smoking and a nonsmoking wife marry the same smoking husband with positive probability. Since the total surplus is the same in both cases, their utility must be the same. Lastly, first order conditions give that   \begin{eqnarray*} u_{N}^{\prime }\left( x\right) &=&\frac{\partial }{\partial x}S\left( x,\phi _{N}\left( x\right) \right) >0,\\ u_{S}^{\prime }\left( x\right) &=&\lambda \frac{\partial }{\partial x}S\left( x,\psi _{S}\left( x\right) \right) >0, \end{eqnarray*} so that   \begin{equation*} \left( u_{N}\left( x\right) -u_{S}\left( x\right) \right) ^{\prime }=\frac{\partial }{\partial x}S\left( x,\phi _{N}\left( x\right) \right) -\lambda \frac{\partial }{\partial x}S\left( x,\psi _{S}\left( x\right) \right). \end{equation*} For x large enough, both ϕN(x) and ψS(x) are close to 1, and that difference is positive.

In short, utility increases with the person’s SES, and smokers are never better off than nonsmokers. However, whenever randomization takes place, a woman’s welfare does not depend on her smoking status: both smokers and nonsmokers marry a smoking husband with positive probability, with whom they generate the same surplus. For men, on the other hand, welfare is always smaller for smokers.
Finally, at the top of the SES distribution, the cost of being a smoker increases with social status. In the uniform-quadratic example we present in the Supplementary Material, things are even simpler: the cost, for a woman, of being a smoker (as measured by the difference $u_N(x) - u_S(x)$) is zero below the randomization threshold and increases with SES above it; for men, it is always positive and always increasing.

Footnotes

1 See for instance Coles and Francesconi (2013) for a search approach under nontransferable utility (NTU), and Chiappori, Oreffice, and Quintana-Domeque (2012) for a general investigation.

2 There is ample medical evidence that second-hand smoking has detrimental health effects on nonsmokers but not on smokers (ASH 2011; CDC 2006; Glymour et al. 2008; Mannino et al. 1997). In addition, the attitude of smokers toward smokers is much more permissive than that of nonsmokers (ASH 2011; Lader 2009; Pilkington et al. 2006).

3 Among multidimensional empirical works, Lindenlaub (2014) considers a specific model in which the distribution of types is normal and the payoff is quadratic to study multidimensional sorting between workers and jobs in the labor market, whereas Galichon and Salanié (2010) and Dupuy and Galichon (2014) use a Choo and Siow (2006) framework. For the analysis of dating patterns using a Gale–Shapley NTU approach, see Banerjee et al. (2013) and Hitsch, Hortaçsu, and Ariely (2010).

4 Taking the model to the data in a structural way is far beyond the scope of this paper. Extending the Choo and Siow (2006) methodology to a multidimensional setting with discrete and continuous characteristics is still an open question, despite recent and promising advances (Chiappori, Salanié, and Weiss 2014; Dupuy and Galichon 2014; Galichon and Salanié 2010).

5 In a pure Choo–Siow setting, this hypothesis would follow from a result by Graham (2011).

6 The theoretical analysis of matching under TU is typical in marriage market analysis, whereas models analyzing dating typically consider NTU. Matching under TU dates back to Koopmans and Beckmann (1957), Shapley and Shubik (1971), and Becker (1973). In particular, the last two contributions show that the stable matching maximizes aggregate surplus, and that the associated individual surpluses solve the dual imputation problem. In turn, the surplus maximization problem belongs to the class of optimal transportation problems, which date back to Monge (1781) and Kantorovich (1942); see Villani (2003) and McCann and Guillen (2010) for recent presentations. The precise connection between matching models and optimal transportation has been analyzed by Gretsky, Ostroy, and Zame (1999) in the discrete case, and by Gretsky, Ostroy, and Zame (1992), Ekeland (2010), and Chiappori, McCann, and Nesheim (2010) in the continuous one.

7 See for instance ASH (2011), CDC (2006), Glymour et al. (2008), Mannino et al. (1997), among others. Moreover, the attitude of smokers toward smokers is much more permissive than that of nonsmokers (ASH 2011; Lader 2009; Pilkington et al. 2006), suggesting that beyond the impact on life expectancy, the psychological costs of a smoking partner are large for nonsmokers but negligible for smokers.

8 A natural extension would allow the surplus to be decreased by a smaller amount when both spouses smoke than when only one does; that is, the surplus would be discounted by λ < 1 if both are smokers, and by λ′ < λ if only one is a smoker. We provide a brief discussion in Section 5.

9 See Chiappori et al. (2010) for a complete presentation.

10 Examples can readily be constructed by using disconnected subpopulations among both men and women.

11 Madrian and Lefgren (1999) illustrate and explain the matching procedures to longitudinally merge the CPS respondents.

12 Standard errors are computed using the delta method.
The difference in the ratios is statistically significant at the 1% level (p-value = 0.0000).

13 Assuming that each extra year of education an individual gets is worth about an 8% increment to their annual earnings.

14 Lundberg (2012) also finds that extraversion and neuroticism only matter for women, whereas conscientiousness only matters for men, in terms of marriage probabilities.

References

Angrist J., Krueger A. (1999). “Empirical Strategies in Labor Economics.” In Handbook of Labor Economics, Vol. 3A, edited by Ashenfelter O., Card D. Elsevier, Amsterdam.

ASH (2011). “Secondhand Smoke.” Research Report.

Banerjee A., Duflo E., Ghatak M., Lafortune J. (2013). “Marry for What? Caste and Mate Selection in Modern India.” American Economic Journal: Microeconomics, 5, 33–72.

Banks J., Kelly E., Smith J.P. (2013). “Spousal Health Effects: The Role of Selection.” In Discoveries in the Economics of Aging (NBER Book Series), edited by Wise D.A. University of Chicago Press.

Becker G. (1973). “A Theory of Marriage: Part I.” Journal of Political Economy, 81, 813–846.

Becker G. (1991). A Treatise on the Family. Harvard University Press.

Caraballo R.S., Giovino G.A., Pechacek T.F., Mowery P.D. (2001). “Factors Associated with Discrepancies between Self-Reports on Cigarette Smoking and Measured Serum Cotinine Levels among Person Aged 17 Years or Older: Third National Health and Nutrition Examination Survey, 1988-1994.” American Journal of Epidemiology, 153, 807–814.

CDC (2006). “The Health Consequences of Involuntary Exposure to Tobacco Smoke: A Report of the Surgeon General.” U.S. Government Printing Office, Washington, DC.

CDC (2010). “Vital Signs: Current Cigarette Smoking Among Adults Aged ≥18 Years—United States, 2009.” Morbidity and Mortality Weekly Report, CDC, 59(35), 1135–1140. Available from: http://www.cdc.gov/mmwr/preview/mmwrhtml/mm5935a3.htm?s_cid=mm5935a3_w. Accessed on July 2016.

Chiappori P.-A., Salanié B. (2016). “The Econometrics of Matching Models.” Journal of Economic Literature, 54, 832–861.

Chiappori P.-A., Iyigun M., Weiss Y. (2009). “Investment in Schooling and the Marriage Market.” American Economic Review, 99, 1689–1713.

Chiappori P.-A., McCann R., Nesheim L. (2010). “Hedonic Price Equilibria, Stable Matching, and Optimal Transport: Equivalence, Topology, and Uniqueness.” Economic Theory, 42, 317–354.

Chiappori P.-A., Oreffice S., Quintana-Domeque C. (2012). “Fatter Attraction: Anthropometric and Socioeconomic Matching on the Marriage Market.” Journal of Political Economy, 120, 659–695.

Chiappori P.-A., Oreffice S., Quintana-Domeque C. (2016). “Black-White Marital Matching: Race, Anthropometrics, and Socioeconomics.” Journal of Demographic Economics, 82, 399–421.

Chiappori P.-A., Salanié B., Weiss Y. (2014). “Partner Choice and the Marital College Premium.” Working paper, Columbia University.

Choo E., Siow A. (2006). “Who Marries Whom and Why.” Journal of Political Economy, 114, 172–201.

Clark A., Etilé F. (2006). “Don’t Give Up on Me Baby: Spousal Correlation in Smoking Behaviour.” Journal of Health Economics, 25, 958–978.

Coles M., Francesconi M. (2013). “Equilibrium Search and the Impact of Equal Opportunities for Women.” Economics Discussion Papers 742, University of Essex, Department of Economics.

Delnevo C.D., Bauer U.E. (2009). “Monitoring the Tobacco Use Epidemic III: The Host: Data Sources and Methodological Challenges.” Preventive Medicine, 48, S16–S23.

Dupuy A., Galichon A. (2014). “Personality Traits and the Marriage Market.” Journal of Political Economy, 122, 1271–1319.

Ekeland I. (2010). “Existence, Uniqueness and Efficiency of Equilibrium in Hedonic Markets with Multidimensional Types.” Economic Theory, 42, 275–315.

Fox J. (2010). “Identification in Matching Games.” Quantitative Economics, 1, 203–254.

Galichon A., Salanié B. (2010). “Cupid’s Invisible Hand: Social Surplus and Identification in Matching Models.” Discussion Paper: 0910–0914, Columbia University.

Glymour M., DeFries T., Kawachi I., Avendano M. (2008). “Spousal Smoking and Incidence of First Stroke: The Health and Retirement Study.” American Journal of Preventive Medicine, 35, 245–248.

Graham B. (2011). “Econometric Methods for the Analysis of Assignment Problems in the Presence of Complementarity and Social Spillovers.” In Handbook of Social Economics, Vol. 1B, edited by Benhabib J., Jackson M., Bisin A. North-Holland, pp. 965–1052.

Gretsky N., Ostroy J., Zame W. (1999). “Perfect Competition in the Continuous Assignment Model.” Journal of Economic Theory, 88, 60–118.

Gretsky N., Ostroy J., Zame W. (1992). “The Nonatomic Assignment Model.” Economic Theory, 2, 103–127.

Gruber J. (2001). “Tobacco at the Crossroads: The Past and Future of Smoking Regulation in the United States.” Journal of Economic Perspectives, 15(2), 193–212.

Hitsch G., Hortaçsu A., Ariely D. (2010). “Matching and Sorting in Online Dating.” American Economic Review, 100(1), 130–163.

Kantorovich L. (1942). “On the Translocation of Masses.” Dokl. Akad. Nauk SSSR, 37, 227–229.

Koopmans T.C., Beckmann M. (1957). “Assignment Problems and the Location of Economic Activities.” Econometrica, 25, 53–76.

Lader D. (2009). “Smoking-Related Behaviour and Attitudes, 2008/09.” Opinions Survey Report 40.

Lam D. (1988). “Marriage Markets and Assortative Mating with Household Public Goods: Theoretical Results and Empirical Implications.” Journal of Human Resources, 23(4), 462–487.

Lundberg S. (2012). “Personality and Marital Surplus.” IZA Journal of Labor Economics, 1.

Madrian B., Lefgren L. (1999). “A Note on Longitudinally Matching CPS Respondents.” NBER Technical Working Paper #247. NBER, Cambridge, MA.

Mannino D., Siegel M., Rose D., Nkuchia J., Etzel R. (1997). “Environmental Tobacco Smoke Exposure in the Home and Worksite and Health Effects in Adults: Results from the 1991 National Health Interview Survey.” Tobacco Control, 6, 296–305.

Maralani V. (2009). “An Unequal Start: The Alignment of Education and Smoking in Families of Origin.” Working paper, Yale University.

Mare R. (2008). “Educational Assortative Mating in Two Generations.” Working paper, University of California at Los Angeles.

McCann R., Guillen N. (2010). “Five Lectures on Optimal Transportation: Geometry, Regularity and Applications.” University of Toronto.

Mills A., Messer K., Gilpin L., Pierce G. (2009). “The Effect of Smoke-Free Homes on Adult Smoking Behavior: A Review.” Nicotine Tobacco Research, 11, 1131–1141.

Monge G. (1781). “Mémoire sur la théorie des déblais et de remblais.” Histoire de l’Académie Royale des Sciences de Paris, avec les Mémoires de Mathématique et de Physique pour la même année. De l'Imprimerie Royale, Paris, pp. 666–704.

NCHS (2010). National Center for Health Statistics. Health, United States, 2009: With Special Feature on Medical Technology. Hyattsville, MD.

Oreffice S., Quintana-Domeque C. (2010). “Anthropometry and Socioeconomics Among Couples: Evidence in the United States.” Economics and Human Biology, 8(3), 373–384.

Pencavel J. (1998). “Assortative Mating by Schooling and the Work Behavior of Wives and Husbands.” American Economic Review Papers and Proceedings, 88, 326–329.

Pilkington P., Gray S., Gilmore A., Daykin N. (2006). “Attitudes Towards Second Hand Smoke amongst a Highly Exposed Workforce: Survey of London Casino Workers.” Journal of Public Health, 28, 104–110.

Qian Z. (1998). “Changes in Assortative Mating: The Impact of Age and Education, 1970–1990.” Demography, 35, 279–292.

Shapley L.S., Shubik M. (1971). “The Assignment Game I: The Core.” International Journal of Game Theory, 1, 111–130.

Shimer R., Smith L. (2000). “Assortative Matching and Search.” Econometrica, 68, 343–369.

Silventoinen K., Kaprio J., Lahelma E., Viken R.J., Rose R.J. (2003). “Assortative Mating by Body Height and BMI; Finnish Twins and their Spouses.” American Journal of Human Biology, 15, 620–627.

Surgeon General’s Report (2001). “Women and Smoking: A Report from the Surgeon General.” US Department of Health and Human Services.

Surgeon General’s Report (2012). “Preventing Tobacco Use Among Youth and Young Adults.” US Department of Health and Human Services.

Sutton G. (1980). “Assortative Marriage for Smoking Habits.” Annals of Human Biology, 7, 449–456.

Sutton G. (1993). “Do Men Grow to Resemble their Wives or Vice Versa?” Journal of Biological Science, 25, 25–29.

Venters M., Jacobs D., Luepker R., Maimaw L., Gillum R. (1984). “Spouse Concordance of Smoking Patterns: The Minnesota Heart Survey.” American Journal of Epidemiology, 120, 608–616.

Villani C. (2003). “Topics in Optimal Transportation.” Graduate Studies in Mathematics, Vol. 58. American Mathematical Society, Providence, RI.

Weiss Y., Willis R. (1997). “Match Quality, New Information, and Marital Dissolution.” Journal of Labor Economics, 15, 293–329.

Wooldridge J. (2002). Econometric Analysis of Cross Section and Panel Data. MIT Press, Cambridge, MA.

Öberg M., Jaakkola M., Woodward A., Peruga A., Prüss-Ustün A. (2011). “Worldwide Burden of Disease from Exposure to Second-Hand Smoke: A Retrospective Analysis of Data from 192 Countries.” Lancet, 377, 139–146.

© The Authors 2017. Published by Oxford University Press on behalf of European Economic Association. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

Journal of the European Economic Association, Oxford University Press

Journal of the European Economic Association, Volume 16 (1), Feb 1, 2018, 38 pages
Publisher: European Economic Association
ISSN: 1542-4766
eISSN: 1542-4774
DOI: 10.1093/jeea/jvx012
### Abstract

We develop a frictionless matching model under transferable utility where individuals are characterized by a continuous trait and a binary attribute. The model incorporates attributes for which there are heterogeneous preferences in the population regarding their desirability, that is, the impact of the traits cannot be summarized by a one-dimensional attractiveness index. We present a general resolution strategy based on optimal control theory, and characterize the stable matching. We then consider education and smoking status, further specify the model by observing that there are more male than female smokers above each education level, and derive additional predictions about equilibrium matching patterns and how individuals with different smoking habits “marry down” or “marry up” by education. Using the CPS March and Tobacco Use Supplements for the period 1996–2003, we find that the hypotheses based on our model predictions are borne out in the data.

1. Introduction

Empirical evidence strongly suggests that matching processes in the marriage market are multidimensional. Spouses tend to be similar in a variety of characteristics, including age, body mass index, education, race, religion, and smoking status (Becker 1991; Oreffice and Quintana-Domeque 2010; Qian 1998; Silventoinen et al. 2003; Sutton 1980; Weiss and Willis 1997). New developments in the matching literature have started to address the multidimensional nature of attractiveness by considering settings with two or more attributes. In particular, several recent studies1 consider frameworks where multiple characteristics can be summarized by a single, one-dimensional attractiveness index, so that, technically, the matching process is de facto one-dimensional. Such a single-index approach is a powerful tool, and it allows one to apply standard one-dimensional matching techniques to settings with multiple attributes.
However, its validity relies on a strong homogeneity assumption: it must be the case that the trade-offs between individual female traits are identically perceived by all men (and similarly for male characteristics). This approach, appealing as it may seem, is not always appropriate. When preferences for any of the relevant matching characteristics are heterogeneous within either the male or the female populations (or both), matching becomes multidimensional, as long as the traits at stake are not perfectly correlated. That is, the corresponding traits cannot be collapsed into a single index, and one-dimensional matching techniques are not well-suited to characterize the corresponding stable matches. In the marriage market, several characteristics can be heterogeneously assessed by potential mates, including race, ethnicity, age, health attributes, or smoking status. For instance, we know that smokers are more likely to marry smokers (Clark and Etilé 2006; Maralani 2009; Sutton 1980; Venters et al. 1984), and that a large medical and public health literature essentially shows that nonsmokers mind their partner’s smoking status, whereas smokers do not.2 The goal of this paper is to extend the standard matching model under Transferable Utility to a particular case of multidimensional matching, where one of the characteristics is assessed heterogeneously by potential mates. In particular, we present a general resolution strategy and characterize the stable matching when individuals differ in two characteristics.3 We also suggest a new way of testing matching models. In most one-dimensional matching models, theoretical implications regarding who marries whom are straightforward; essentially, they boil down to supermodularity implying assortative matching. In multidimensional contexts, however, things are much more complex, since theory may imply specific properties on matching patterns along the various dimensions under consideration. 
We argue that such predictions may be taken to the data and tested in reduced form. Specifically, in this paper we study a parsimonious model and derive a series of predictions. We then test hypotheses based on these predictions, and find they are indeed supported by the data. Although our hypothesis tests can be formally derived from several specific stochastic structures found in the literature, we argue that they do not require explicit assumptions on the stochastic structure of unobservables; therefore, and in the same spirit as previous tests of index-based models (e.g., Chiappori, Oreffice, and Quintana-Domeque 2012), such tests can be viewed as a useful complement to more structural approaches.

In our model, individuals are characterized by two dimensions. One characteristic is continuous, and can be interpreted as an index of socioeconomic status (henceforth SES), reflecting differences in education, income, social prestige, and others, or any combination of those. The other characteristic is discrete, and more precisely dichotomous. We suggest interpreting the second characteristic as the individual’s smoking status. The marital surplus function is assumed to be differentiable and supermodular in the continuous indices, as is common in the literature, and to be multiplicatively impacted by the discrete characteristic: it may be diminished by the presence of a smoker in the couple. As long as smoking and socioeconomic status are not perfectly correlated (and they are not), this heterogeneity in preferences is a key feature of our setting: it rules out a single-index representation, since the trade-off between the two characteristics is perceived differently among potential spouses, which is precisely what index models forbid.

We first analyze, as a benchmark, a fully symmetric version of the model, in which male and female characteristics play the same role in the surplus function and are identically distributed.
We show that the resulting stable matching exhibits full segregation, in the sense that smokers exclusively marry smokers. This result is interesting per se, since it does not depend on the magnitude of the negative impact of smoking. In our “pure” frictionless framework, therefore, even minor differences in preferences (or surplus) may result in large-scale segregation; moreover, the latter is efficient, in the sense that it does maximize total welfare. We then consider the general asymmetric case. We prove existence and generic uniqueness of the stable matching and present a general resolution strategy, which can be used in any discrete or continuous multidimensional framework. Even in our simple framework, a closed-form characterization of the stable matching cannot be obtained in the general case; however, we derive general predictions on the form that the stable matching may take. The analysis of a specific, quadratic case in which a closed form solution exists is provided in our Supplementary Material. Next, we further specify the model by assuming that, for any SES level, there are more male smokers than female smokers among individuals above that SES level, a pattern that we actually observe in the data. This generates a set of additional predictions regarding the nature of the stable matching. First, there are no “mixed” couples in which she smokes and he does not, whereas the opposite pattern—he smokes and she does not—happens with positive probability. Second, smoking husbands married to smokers are of higher “quality” (i.e., higher SES) than those married to nonsmokers: smoking “premium” for smoking wives. Third, and conversely, nonsmoking wives married to smoking husbands have a lower SES than those married to nonsmokers: smoking “penalty” for smoking husbands. Fourth, there is positive assortative matching on SES among couples with identical smoking habits. 
Fifth, positive assortative matching on SES by smoking status is stronger at the top of the SES distribution: smoking husbands (wives) at the top of the SES distribution are more likely to marry smoking wives (husbands) than those with lower SES. Similarly, nonsmoking husbands (wives) at the top of the SES distribution are more likely to marry nonsmoking wives (husbands) than those with lower SES. Finally, some female nonsmokers may marry either a smoker or a nonsmoker with positive probability; then it must be the case that the smoking husband has a higher SES. Similarly, some male smokers may marry either a smoker or a nonsmoker; but then both women have the same SES. We use the Current Population Survey March Supplements data combined with the Tobacco Use Supplements (TUS) for the period 1996–2003. These TUS supplements, which are widely used in medical research on smoking, provide the largest representative sample of the US population, and, crucially, allow us to retrieve information on both spouses. We follow a reduced-form approach to test the hypotheses based on the theoretical predictions on the stable matching derived from our deterministic model, acknowledging that our general model could not be solved under a general stochastic structure. Our approach is compatible with a Choo–Siow specification, that is, a framework where unobserved heterogeneity can be fully captured by an additive random term, which is moreover the sum of two random variables with type 1 extreme value distributions; but it could also be used in alternative settings.4 In our deterministic model, stability conditions imply that there should be no “mixed” couples in which she smokes and he does not, whereas the opposite pattern—he smokes and she does not—should happen with positive probability (the asymmetry between genders being due to the larger prevalence of male smoking). 
In a more complex setting, the presence of either search frictions (à la Shimer and Smith 2000) or unobserved matching characteristics (or both) typically results in positive probabilities for all possible couples. Still, a natural hypothesis is that mixed couples where the wife smokes should be much less frequent than vice versa; that is, the ratio of the two subpopulation sizes should be significantly lower than what would be implied by the sole difference in relative smoking prevalence.5 To test this hypothesis, we compare the actual subpopulations ratio to what it should be under independence for each of the nine US Geographical Census divisions; interestingly, we find that the hypothesis is satisfied in each of them, despite significantly different smoking prevalences across regions. This is consistent with our first prediction. Similarly, our regression analysis shows that among smoking husbands, those who marry smoking wives exhibit (on average) 0.15 more years of completed education (or about a 1.2% higher annual earnings) than those with nonsmoking wives, consistent with our second prediction. Conversely, our evidence reveals that among nonsmoking wives, those with smoking husbands exhibit (on average) 0.11–0.13 fewer years of completed education than those with nonsmoking husbands, supporting our third prediction. Consistent with our fourth prediction, we also find positive assortative matching on education for each type of couple, and in particular for couples with identical smoking habits. In addition, at the top of the SES distribution, positive assortative matching on education by smoking status is stronger, supporting our fifth prediction. 
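The independence benchmark behind the first hypothesis can be sketched as follows. This is a minimal illustration, assuming that "under independence" means the wife's and the husband's smoking statuses are drawn independently from their observed marginal prevalences; the counts are made up for one hypothetical division (they are not CPS data), the function names are ours, and the delta-method standard errors used in the paper are not reproduced.

```python
def independence_ratio(p_wife_smokes, p_husband_smokes):
    """Ratio P(wife smokes, husband does not) / P(husband smokes, wife does not)
    that would hold if spouses' smoking statuses were independent."""
    wife_only = p_wife_smokes * (1.0 - p_husband_smokes)
    husband_only = (1.0 - p_wife_smokes) * p_husband_smokes
    return wife_only / husband_only


def observed_ratio(counts):
    """Same ratio computed from couple counts keyed by (wife, husband) status."""
    return counts[("S", "N")] / counts[("N", "S")]


def marginal_prevalences(counts):
    """Smoking prevalence among wives and among husbands."""
    total = sum(counts.values())
    p_wife = (counts[("S", "S")] + counts[("S", "N")]) / total
    p_husband = (counts[("S", "S")] + counts[("N", "S")]) / total
    return p_wife, p_husband


# Hypothetical couple counts, keyed by (wife status, husband status).
example_counts = {
    ("S", "S"): 120,
    ("S", "N"): 40,
    ("N", "S"): 140,
    ("N", "N"): 700,
}

p_wife, p_husband = marginal_prevalences(example_counts)
expected = independence_ratio(p_wife, p_husband)
observed = observed_ratio(example_counts)

if __name__ == "__main__":
    print("expected under independence:", expected)
    print("observed:", observed)
```

In this made-up example the observed ratio falls below the independence benchmark, which is the direction the hypothesis predicts; whether the gap is statistically significant is a separate question that the sketch does not address.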
Finally, the well-known negative correlation between education and smoking is confirmed in our data, and we estimate that for men this correlation becomes less negative if one further controls for the wife’s education, whereas this pattern does not appear for women; we argue that this fact is in line with our last prediction. Perhaps the most informative prediction of our bidimensional matching model is the one stating that smoking husbands married to smokers have higher SES than those married to nonsmokers. Such a prediction is typical of a matching logic, in which female smokers eventually benefit from being on the short side of the market. A potential concern here is that such empirical patterns (even when controlling for additional covariates), instead of being a consequence of matching mechanisms on the marriage market, may reflect the fact that smoking behavior is endogenous to marital status. However, we doubt this can explain our findings. Using information on engaged couples, newlyweds, and couples married for over 5 years, Sutton (1993) finds that similarities in smoking status were already present about the time of marriage. More recently, Banks, Kelly, and Smith (2013), who use retrospective evidence from the Health and Retirement Study, show that most smoking behavior is initiated before marriage, so that smoking is predetermined with respect to marital status. In addition, we use a simple and popular estimator in program evaluation (see Wooldridge 2002) obtained from an OLS regression of education on spousal smoking status controlling for the predicted probability that the spouse is a smoker (and its square). The key variable(s) used in predicting such a probability is (are) the age at which the spouse started smoking regularly (and its square). Reassuringly, the estimates adjusted by the spousal propensity to smoke give a similar picture to our main estimates. 
Finally, we present two refutability tests in our Supplementary Material: one based on randomly sorting married men and women to couples, which shows that the predictions of our model are rejected; another based on testing a single-index model in education and smoking, which is rejected by the data. The next section contains the model. Section 3 describes the data. Section 4 tests the hypotheses based on the theoretical predictions of our model. Section 5 presents a brief discussion, whereas Section 6 concludes. 2. The Model We present a parsimonious preference model given the two types of characteristics at stake, defining the surplus structure and the corresponding stable matches under Transferable Utility (from now on TU).6 After characterizing the equilibrium under a benchmark “symmetric” case (where male and female characteristics play the same role in the surplus function and are identically distributed), we analyze the general case deriving predictions on the form that the stable matching may take, which can be used in any discrete or continuous multidimensional framework, although a closed-form characterization of the stable matching cannot be obtained. Finally, we introduce the male prevalence condition into the model (for any SES level, there are more male smokers than female smokers among individuals above that SES level), and derive several predictions that will be the basis of the hypotheses in our empirical analysis. 2.1. The Basic Framework 2.1.1. Populations We consider two populations (men and women) of equal size, normalized to one. Agents differ in two respects. First, they are characterized by a continuous index; one may, without loss of generality, assume that this index is uniformly distributed over the interval [0, 1]. A possible interpretation is in terms of socioeconomic status (SES); then the index depends on the agent’s income, education, prestige, or any combination of those. 
Second, agents are also characterized by some dichotomous indicator taking values in the set $\{N, S\}$; in our empirical application, $S$ stands for “smoker” and $N$ for “nonsmoker”, although alternative interpretations are possible. An agent is thus formally characterized by a pair $(x, X)$ if female and $(y, Y)$ if male, where $x, y \in [0, 1]$ is the agent’s continuous index (i.e., SES), and $X, Y \in \{N, S\}$ defines the agent’s discrete characteristic (i.e., smoking status). Let $F$ (resp. $G$) denote the cumulative distribution of female (male) characteristics $(x, X)$ (resp. $(y, Y)$) over the set $[0, 1] \times \{N, S\}$, and let $F_X(x)$ (resp. $G_Y(y)$) denote the number of females (males) with smoking status $X$ ($Y$) and SES no larger than $x$ ($y$), for $X, Y \in \{N, S\}$. In particular, $F_X(1)$ and $G_Y(1)$, respectively, denote the total number of females and males with smoking habits $X$ and $Y$.

2.1.2. Surplus

In any married couple, the sum of individual utilities is given by some function of the partners' characteristics; as is customary, we define the surplus generated by marriage as the difference between this function and the sum of utility levels that each spouse would reach as single. In our framework, the surplus depends on both the discrete and the continuous characteristics of each partner. We assume that the surplus $\Sigma$ generated by a match between $(x, X)$ and $(y, Y)$ has the form
\begin{equation*} \Sigma \left( \left( x,X\right) ,\left( y,Y\right) \right) = \left\lbrace \begin{array}{ll}S\left(x,y\right) & {\rm if}\, {X=Y=N},\\ \lambda S\left( x,y\right) & {\rm otherwise}. \end{array}\right. \end{equation*}
The function $S$ is strictly increasing, continuously differentiable and supermodular; moreover, $\lambda < 1$. This means that the impact on marital surplus of the spouses’ smoking habits is fully summarized by a single parameter $\lambda$, which represents the decrease in surplus due to the presence of (at least) a smoker in the couple.
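To make the surplus specification concrete, the sketch below codes $\Sigma$ and, for a tiny discrete market, finds the assignment that maximizes aggregate surplus by brute force (under TU, the stable matching is the one that maximizes aggregate surplus). Everything specific here is an assumption for illustration: the functional form $S(x,y) = xy$, the value $\lambda = 0.9$, the four hypothetical types on each side, and the function names. The toy market is symmetric, so it mimics the benchmark in which smokers marry smokers.

```python
from itertools import permutations

LAMBDA = 0.9  # hypothetical value of the discount parameter (must be < 1)


def S(x, y):
    """Assumed continuous surplus: increasing and supermodular on (0, 1]."""
    return x * y


def sigma(woman, man, lam=LAMBDA):
    """Surplus of a couple: S(x, y) if both are nonsmokers,
    lam * S(x, y) as soon as at least one spouse smokes."""
    (x, X), (y, Y) = woman, man
    base = S(x, y)
    return base if (X == "N" and Y == "N") else lam * base


def best_assignment(women, men, lam=LAMBDA):
    """Brute-force search over one-to-one assignments (small markets only).
    Returns (perm, total) where woman i is matched with man perm[i]."""
    best_total, best_perm = float("-inf"), None
    for perm in permutations(range(len(men))):
        total = sum(sigma(w, men[j], lam) for w, j in zip(women, perm))
        if total > best_total:
            best_total, best_perm = total, perm
    return best_perm, best_total


# Hypothetical symmetric market: identical type distributions on both sides.
women = [(0.2, "N"), (0.8, "N"), (0.5, "S"), (0.9, "S")]
men = list(women)

perm, total = best_assignment(women, men)
couples = [(women[i], men[j]) for i, j in enumerate(perm)]

if __name__ == "__main__":
    for wife, husband in couples:
        print(wife, "marries", husband)
    print("aggregate surplus:", total)
```

Brute force is only workable for a handful of agents; for larger discrete markets one would switch to a linear assignment solver, and the continuous model in the text requires the analytical approach developed in the paper.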
Hence, in our framework, the surplus of a mixed (smoker, nonsmoker) couple is the same as that of a couple of smokers, but strictly less than that of a nonsmoking pair. Restrictive as it may seem, we believe that the common λ assumption is a reasonable approximation. The multiplicative nature of the impact of λ reflects the fact that smoking, by reducing life expectancy, decreases the present discounted value of future welfare proportionally. We believe that this is more realistic than an additive parameter, as the latter would imply that those couples with lower socioeconomic resources (and surplus) would be disproportionately penalized by having (at least) a smoker in the couple. In addition, a large medical and public health literature7 shows that a smoking partner decreases the life expectancy of a nonsmoker but not of a smoker, which is compatible with our assumption.8 Last but not least, although the single parameter λ assumption cannot be directly tested (since the marital surplus is not observed), one can evaluate its plausibility using auxiliary conditions of the Choo–Siow type, such as unobserved heterogeneity following a type 1 extreme value distribution. We perform this exercise and find empirical support for it (see Table E.1 in the Supplementary Material). 2.2. Stable Matching 2.2.1. Definition A matching is defined as a measure μ on the set ([0, 1] × {N, S})² and four functions uN(x), uS(x), vN(y) and vS(y). Intuitively, for two sets A, B ⊂ [0, 1] × {N, S}, μ[A, B] denotes the probability that a woman belonging to A is married to a man belonging to B; and for any female (x, X) (male (y, Y), with X, Y ∈ {N, S}), uX(x) (vY(y)) is the utility she (he) receives at a stable matching. A constraint on μ is that its marginal should equal the initial distributions of individuals; that is, the marginal on the set of females (males) is F (G).
In addition, on the support of μ, individual utilities satisfy   \begin{equation*} u_{X}\left( x\right) +v_{Y}\left( y\right) =\Sigma \left( \left( x,X\right) ,\left( y,Y\right) \right) ,\forall \left( \left( x,X\right) ,\left( y,Y\right) \right) \in \mathit{Supp}\left( \mu \right) , \end{equation*} reflecting the fact that if two agents may marry with positive probability, their individual utilities must add up to the surplus they generate when married. A matching is stable if no matched agent would be better off unmatched, and if no two individuals would prefer being matched together to their current situation. Normalizing singles’ utility to zero, stability can be summarized by the following set of inequalities: for any (x, X), (y, Y) we have that   $$u_{X}\left( x\right) \ge 0,\quad v_{Y}\left( y\right) \ge 0 \quad {\rm and} \quad u_{X}\left( x\right) +v_{ Y }\left( y\right) \ge \Sigma \left( \left( x,X\right) ,\left( y,Y\right) \right),$$ (1) therefore   $$u_{X}\left( x\right) +v_{ Y }\left( y\right) \ge \left\lbrace \begin{array}{@{}l@{\quad }l@{}}S\left( x,y\right) & {\rm if} \, {X= Y =N},\\ \lambda S\left( x,y\right) & {\rm otherwise}, \end{array}\right.$$ (2) where an equality obtains on the support of μ. The first constraints in (1) reflect the requirement that married people should prefer marriage to singlehood; the second constraint in (1) expresses that any two individuals cannot, by forming a new match, strictly increase their current utilities. 2.2.2.
Existence Existence of a stable match stems from general results, which state that in a TU context, the minimization of aggregate utility over the set of stable matches is equivalent to the maximization of aggregate surplus over all possible assignments.9 Formally, if (μ, uN(x), uS(x), vN(y), vS(y)) is a stable matching, then the measure μ solves   $$\max _{\nu \in \mathcal {M}}\int \Sigma \left( \left( x,X\right) ,\left( y,Y\right) \right) d\nu \left( \left( x,X\right) ,\left( y,Y\right) \right),$$ (3) where $$\mathcal {M}$$ denotes the set of measures on the set ([0, 1] × {N, S})² whose marginal distributions coincide with the initial measures F and G on the female and male populations, respectively. Since this set is compact and Σ is continuous in x and y, a solution exists. Conversely, for any solution $$\bar{\mu }$$ to the surplus maximization problem, consider the dual program   \begin{eqnarray*} &&{\min _{u_{N},u_{S},v_{N},v_{S}}\int _{\left[ 0,1\right] \times \left\lbrace N,S\right\rbrace }\left( \boldsymbol{1}\left[ X=S\right] u_{S}\left( x\right) +\boldsymbol{1}\left[ X=N\right] u_{N}\left( x\right) \right) dF\left( x,X\right)} \\ &&\quad {} +\int _{\left[ 0,1\right] \times \left\lbrace N,S\right\rbrace }\left( \boldsymbol{1}\left[ Y=S\right] v_{S}\left( y\right) +\boldsymbol{1} \left[ Y=N\right] v_{N}\left( y\right) \right) dG\left( y,Y\right) \end{eqnarray*} under the constraints in (1). If $$\left( \bar{u}_{N},\bar{u} _{S},\bar{v}_{N},\bar{v}_{S}\right)$$ denotes a solution, then $$\left( \bar{\mu },\bar{u}_{N},\bar{u}_{S},\bar{v}_{N},\bar{v}_{S}\right)$$ defines a stable matching. Let us note that there may exist matches involving “mixed strategies”, whereby an open set of agents are each matched to several agents with positive probabilities. In our setting, an individual may be indifferent between two types of mates.
For instance, if a woman with SES x0 is a nonsmoker, the partial of the surplus with respect to x is ∂S(x0, y1)/∂x if she marries a nonsmoker with SES y1, and λ∂S(x0, y2)/∂x if she is matched with a smoker with SES y2. Although S is strictly supermodular, we may still have that   \begin{equation*} \frac{\partial S}{\partial x}\left( x_{0},y_{1}\right) =\lambda \frac{\partial S}{\partial x}\left( x_{0},y_{2}\right) \end{equation*} with y2 > y1 since λ < 1. 2.3. The Symmetric Case As a benchmark, we consider a setting that is totally symmetric between genders. This implies that (i) male and female characteristics play the same role in the surplus function (S(x, y) = S(y, x)) and (ii) the distributions of characteristics are identical for men and women (F = G). Then, the stable matching can easily be characterized. Proposition 1. Under the symmetry assumptions (i) and (ii) above, there exists a unique stable matching, which is completely assortative. Smokers only marry smokers, and nonsmokers only marry nonsmokers; moreover, in each couple, spouses have the same SES. $$\textit{Proof}$$. It is well known that a matching is stable if and only if the corresponding measure maximizes total surplus. In this context, positive assortativeness within each smoking category directly follows from supermodularity. We simply need to show that it cannot be the case that an open set of nonsmokers of one gender marry smokers. Assume it is; then an equal measure open set of nonsmokers of the opposite gender also marry smokers. But since λ < 1, the total surplus is then less than in the completely assortative matching, a contradiction. Although the symmetric case is obviously very specific, it constitutes an interesting benchmark. A first lesson that can be drawn from it is that bidimensional matching of the type under consideration naturally leads to segregated outcomes. 
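The segregation result of Proposition 1 can be checked by brute force on a small discrete example. The sketch below is illustrative only: the surplus S(x, y) = (x + y)^2, the value λ = 0.99, and the six (SES, smoking status) profiles are assumptions chosen for the example and do not come from the paper. It enumerates every one-to-one assignment and keeps the one with the largest total surplus.

```python
from itertools import permutations

LAMBDA = 0.99  # hypothetical value: a smoker in the couple costs only 1% of the surplus


def S(x, y):
    # Illustrative surplus: symmetric, increasing on [0, 1]^2, strictly supermodular.
    return (x + y) ** 2


def sigma(woman, man):
    # Surplus of a couple: S if both are nonsmokers, LAMBDA * S otherwise.
    (x, X), (y, Y) = woman, man
    return S(x, y) if X == "N" and Y == "N" else LAMBDA * S(x, y)


# Symmetric case: both genders have the same (SES, smoking status) profiles.
women = [(0.2, "N"), (0.5, "N"), (0.9, "N"), (0.3, "S"), (0.6, "S"), (0.8, "S")]
men = list(women)


def total_surplus(assignment):
    # assignment[i] is the index of the man matched to woman i.
    return sum(sigma(women[i], men[j]) for i, j in enumerate(assignment))


# Exhaustive search over all 6! = 720 assignments.
best = max(permutations(range(len(men))), key=total_surplus)
couples = [(women[i], men[j]) for i, j in enumerate(best)]
print(couples)
```

In this example every woman is matched to the man with her own SES and smoking status, even though λ is close to one. Exhaustive search is only practical for a handful of agents; it is used here because it needs no external solver.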
In the symmetric context, even if the loss incurred when a nonsmoker marries a smoker is very small (i.e., λ is very close to one), the marriage patterns exhibit complete segregation, in the sense that smokers exclusively marry smokers and nonsmokers exclusively marry nonsmokers. In other words, minor differences in preferences may have a spectacular impact on marital patterns, particularly in terms of segregation. In addition, one can easily compute each spouse’s utility at a stable matching (see Appendix). The preceding conclusions, however, heavily rely on the very specific features of the symmetric framework. In particular, no trade-off exists between the two characteristics at the stable matching: there is no point, for a nonsmoker, in considering a smoker as a spouse, since a nonsmoker with exactly the same SES is always available at equilibrium. 2.4. The General Case We now consider the general case. In what follows, let pN(x) be the probability that a nonsmoking woman with SES x marries a smoker (then 1 − pN(x) is the probability she marries a nonsmoker). We define similarly pS(x΄), qN(y) and qS(y΄) as the probability of marrying a smoker for a female smoker, a male nonsmoker and a male smoker, respectively. These probabilities are endogenous, and are determined by the equilibrium (stability) conditions. 2.4.1. Qualitative Results In this subsection, we provide some qualitative properties of the equilibrium, which hold true irrespective of the exact distribution of smokers and nonsmokers in the population and the exact form of the (supermodular) surplus. A first result expresses the fact that, at a stable matching, matching is positive assortative on SES among couples with identical smoking habits. Proposition 2. Consider two matched couples, (x, X), (y, Y) and (x΄, X), (y΄, Y) with identical smoking status. For almost all such couples, x ≥ x΄ if and only if y ≥ y΄. $$\textit{Proof}$$. 
Assume, for instance, that x ≥ x΄ but y < y΄ on a subset of positive measure. The surplus generated by any two such couples is   \begin{equation*} \Sigma _{1}=\Sigma \left( \left( x,X\right) ,\left( y,Y\right) \right) +\Sigma \left( \left( x^{\prime },X\right) ,\left( y^{\prime },Y\right) \right) \end{equation*} whereas the matching (x, X), (y΄, Y) and (x΄, X), (y, Y) would generate a surplus   \begin{equation*} \Sigma _{2}=\Sigma \left( \left( x,X\right) ,\left( y^{\prime },Y\right) \right) +\Sigma \left( \left( x^{\prime },X\right) ,\left( y,Y\right) \right) >\Sigma _{1} \end{equation*} by strict supermodularity of Σ in (x, y). Should this situation exist for a non-null set of couples, the matching would not maximize total surplus, a contradiction. The second result states that among individuals with high SES, matching is assortative on both SES and smoking status. Proposition 3. Assume that the upper bound of the support of the measures FS, FN, GS, and GN is 1. Then there exist thresholds xN, xS, yN, and yS in [0, 1) such that for almost all xN ≤ x ≤ 1, xS ≤ x΄ ≤ 1, yN ≤ y ≤ 1, and yS ≤ y΄ ≤ 1,   \begin{equation*} p_{N}\left( x\right) =q_{N}\left( y\right) =0\,\quad{\textit {and}}\quad\,p_{S}\left( x^{\prime }\right) =q_{S}\left( y^{\prime }\right) =1. \end{equation*} $$\textit{Proof}$$. See Appendix. Proposition 3 states that nonsmokers with high enough SES marry nonsmokers with probability 1: marrying a smoker would decrease the surplus by a factor λ, which for high SES can only decrease total surplus. Similarly, smokers with high enough SES only marry smokers. However, the previous result is only true at the top of the SES distribution; further down, randomization may appear at a stable matching. To see why, assume that the distributions of the male and female populations are such that men are much more likely to smoke than women.
The assortative pattern described in Proposition 2, together with the measure restrictions, implies that both nonsmoking wives and smoking husbands, being on the long side of the market, will have to “marry down” (i.e., marry a spouse with relatively lower SES). At some point, the marginal nonsmoking wife may become indifferent between marrying a nonsmoker or the marginal smoking husband, because the resulting loss in total surplus (due to λ < 1) is exactly offset by the higher SES of the latter. Note that, in particular, male smokers who match with female nonsmokers must have a higher socioeconomic status than their wives; indeed, the only reason why a female nonsmoker would agree to marry a smoker (thus accepting a shrinkage of total surplus) is the shortage of male nonsmokers. This particular pattern is not directly linked to the multiplicative nature of the impact of smoking; as soon as marrying a smoker involves a loss in surplus, we expect similar patterns. What is specific to the multiplicative setting, however, is that these phenomena do not take place at the top of the distribution, where the loss is larger. This same logic also suggests, however, that in a given neighborhood, not all forms of randomization are simultaneously possible. In our example, for instance, although female nonsmokers may want to marry a smoking spouse, male nonsmokers would not, since they would lose a share λ of the surplus and the opportunity to “marry up”. This feature is indeed general and is expressed by the following result. Proposition 4. Assume there exists an open set O such that, for all x ∈ O, 0 < pN(x) < 1 (so that x marries either a nonsmoker y or a smoker y΄ with positive probability). Assume moreover that qS(y΄) > 0 (so that y΄ also marries a smoker x΄ with positive probability). Then qN(y) = 0 and pS(x΄) = 1 almost surely. Moreover, x΄ = x and y΄ > y.
Similarly, if for all y in some open set O΄, 0 < qN(y) < 1 (so that y marries either a nonsmoker x or a smoker x΄ with positive probability) and pS(x΄) > 0 (so that x΄ also marries a smoker y΄ with positive probability), then pN(x) = 0 and qS(y΄) = 1 almost surely. Moreover, x΄ > x and y΄ = y. $$\textit{Proof}$$. See Appendix. Proposition 4 states that the various types of randomizations are mutually exclusive. In the neighborhood of some given SES, it may be the case that female nonsmokers and male smokers intermarry with positive probability; but then, in this same neighborhood, female smokers and male nonsmokers only marry their own type. Of course, the pattern may be opposite in a different neighborhood10; ultimately, the matching patterns depend on the distributions F and G. 2.4.2. Surplus Maximization We now describe the mathematical form of the problem, and indicate a resolution strategy that works for arbitrary distributions. Start with the constraint that the marginals of the stable measure μ must coincide with the female and male population measures. Proposition 2 greatly simplifies the expression of these constraints, as the following points demonstrate. Consider a nonsmoker female with SES x, matched with a nonsmoker male with SES y = ϕN(x). Then the number of nonsmoker females with SES higher than x, married to a nonsmoker, must equal the number of nonsmoker males with SES higher than y, married to a nonsmoker,   $$\int _{x}^{1}\left( 1-p_{N}\left( t\right) \right) dF_{N}\left( t\right) =\int _{\phi _{N}\left( x\right) }^{1}\left( 1-q_{N}\left( t\right) \right) dG_{N}\left( t\right),$$ (4) which defines the function ϕN. Similarly, for a nonsmoker female with SES x marrying a smoker with SES y΄ = ψN(x),   $$\int _{x}^{1}p_{N}\left( t\right) dF_{N}\left( t\right) =\int _{\psi _{N}\left( x\right) }^{1}\left( 1-q_{S}\left( t\right) \right) dG_{S}\left( t\right),$$ (5) which defines ψN.
For the other combinations of smoking status   \begin{align} \int _{x}^{1}\left( 1-p_{S}\left( t\right) \right) dF_{S}\left( t\right) =\int _{\phi _{S}\left( x\right) }^{1}q_{N}\left( t\right) dG_{N}\left( t\right), \end{align} (6)  \begin{align} \int _{x}^{1}p_{S}\left( t\right) dF_{S}\left( t\right) =\int _{\psi _{S}\left( x\right) }^{1}q_{S}\left( t\right) dG_{S}\left( t\right). \end{align} (7) In particular, there exists a one-to-one relationship between the four probability functions (pN, pS, qN, qS) and the four matching functions (ϕN, ϕS, ψN, ψS). Clearly, the exact marital patterns characterizing the stable matching depend on the joint distribution of SES and smoking status of the two populations. Here, we provide the general tool that can be used to solve the problem with arbitrary distributions. The idea, again, is to exploit the duality between stability and surplus maximization. With the previous notations, aggregate surplus is   \begin{align} \Sigma & =\int _{0}^{1}\left[ \left( 1-p_{N}\left( t\right) \right) S\left( t,\phi _{N}\left( t\right) \right) +\lambda p_{N}\left( t\right) S\left( t,\psi _{N}\left( t\right) \right) \right] dF_{N}\left( t\right) \\ &\quad {} +\lambda \int _{0}^{1}\left[ \left( 1-p_{S}\left( t\right) \right) S\left( t,\phi _{S}\left( t\right) \right) +p_{S}\left( t\right) S\left( t,\psi _{S}\left( t\right) \right) \right] dF_{S}\left( t\right). \nonumber \end{align} (8) The first integral considers the contribution of the female nonsmoker population. An individual with SES t may (with probability pN(t)) be matched with a smoker with SES ψN(t), in which case the couple generates a surplus λS(t, ψN(t)); alternatively, she may (with probability 1 − pN(t)) be matched with a nonsmoker with SES ϕN(t), generating a surplus S(t, ϕN(t)). Similarly, the second integral represents the contribution of the female smoker population to total surplus. 
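To make the structure of the aggregate surplus (8) concrete, the following sketch approximates it with a midpoint rule. It rests on assumptions that are not in the text: F_N and F_S are taken to admit densities f_N and f_S, the function name `aggregate_surplus` and all of its parameter names are hypothetical, and the surplus S(x, y) = (x + y)^2, λ = 0.9 and the constant densities in the usage example are chosen purely for illustration.

```python
def aggregate_surplus(S, lam, p_N, p_S, phi_N, psi_N, phi_S, psi_S, f_N, f_S, n=1000):
    """Midpoint-rule approximation of the aggregate surplus in equation (8).

    p_N, p_S: probabilities that a female nonsmoker/smoker marries a smoker.
    phi_*, psi_*: SES of the nonsmoking/smoking mate of a woman with SES t.
    f_N, f_S: assumed densities of female nonsmokers/smokers on [0, 1].
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        # Female nonsmoker: surplus S with a nonsmoker, lam * S with a smoker.
        nonsmoker = (1 - p_N(t)) * S(t, phi_N(t)) + lam * p_N(t) * S(t, psi_N(t))
        # Female smoker: every match is scaled by lam (applied below).
        smoker = (1 - p_S(t)) * S(t, phi_S(t)) + p_S(t) * S(t, psi_S(t))
        total += (nonsmoker * f_N(t) + lam * smoker * f_S(t)) * h
    return total


def same_ses(t):
    # Mate with exactly the same SES.
    return t


# Usage: fully segregated matching (p_N = 0, p_S = 1) with equal-SES spouses.
value = aggregate_surplus(
    S=lambda x, y: (x + y) ** 2,
    lam=0.9,
    p_N=lambda t: 0.0,
    p_S=lambda t: 1.0,
    phi_N=same_ses, psi_N=same_ses, phi_S=same_ses, psi_S=same_ses,
    f_N=lambda t: 0.8,
    f_S=lambda t: 0.2,
)
print(value)
```

For this particular configuration the integral has the closed form (0.8 + 0.9 × 0.2) × 4/3, which gives a simple check on the numerical routine. The function only evaluates the objective for given matching functions; it does not solve the optimal control problem.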
A stable matching, defined by the functions pN, pS, qN, qS and ϕN, ϕS, ψN, ψS, linked by (4)–(7), maximizes aggregate surplus under the constraints   \begin{equation*} 0 \leq p_{A}(t) \leq 1,\quad 0 \leq q_{A}(t) \leq 1,\quad 0 \leq \phi _{A}(t) \leq 1,\quad 0 \leq \psi _{A}(t) \leq 1, \end{equation*} where A = N, S. The stable matching can therefore be derived as a solution to a maximization (optimal control) problem. Finally, once the functions pN, pS, qN, qS (or equivalently ϕN, ϕS, ψN, ψS), which define the stable matching, have been computed, one can readily recover the intracouple allocation of the surplus (see Appendix). 2.5. The Constrained General Case We now introduce the additional assumption that “men smoke more than women”, in the following sense. Assumption MP (Male Prevalence). For any SES level x,   \begin{equation*} F_{S}\left( x\right) \ge G_{S}\left( x\right). \end{equation*} In words, for any SES x, there are more male than female smokers with SES x or larger. This may describe a situation in which men are more likely to smoke, but also one in which men and women have similar smoking habits but the male SES distribution first degree dominates the female one, as well as many others. As we shall see in the empirical part, the MP Assumption is satisfied in the data. Under that assumption, it is possible to characterize the stable matching as follows. Proposition 5. Assume that the upper bound of the support of the measures FS, FN, GS and GN is 1.
Then there exist thresholds xN, xS, yN and yS in [0, 1) such that, for almost all xN ≤ x ≤ 1, xS ≤ x΄ ≤ 1, yN ≤ y ≤ 1 and yS ≤ y΄ ≤ 1,   \begin{equation*} p_{N}\left( x\right) =q_{N}\left( y\right) =0\quad{\textit{and}}\quad p_{S}\left( x^{\prime }\right) =q_{S}\left( y^{\prime }\right) =1; \end{equation*} for almost all x < xN, x΄ < xS, y < yN and y΄ < yS,   \begin{equation*} 0\le p_{N}\left( x\right) \le 1,\quad 0\le q_{S}\left( y^{\prime }\right) \le 1,\quad q_{N}\left( y\right) =0\quad{\textit{and}}\quad p_{S}\left( x^{\prime }\right) =1; \end{equation*} for any x such that pN(x) > 0, let ϕN(x) and ψN(x), respectively, denote the nonsmoker and smoker mate of (x, N). Then   \begin{equation*} \psi _{N}\left( x\right) >\phi _{N}\left( x\right); \end{equation*} for any y΄ such that qS(y΄) < 1, let ϕS(y΄) and ψS(y΄), respectively, denote the nonsmoker and smoker mate of (y΄, S). Then   \begin{equation*} \phi _{S}\left( y^{\prime }\right) =\psi _{S}\left( y^{\prime }\right). \end{equation*} $$\textit{Proof}$$. See Appendix. Proposition 5 first states that high SES people are exclusively matched to partners with identical smoking habits, a result already derived in Proposition 3. For lower SES, randomization may happen, as Proposition 4 indicates, but only in one direction: nonsmoking women may marry smokers, but smoking women and nonsmoking men always marry their own. Lastly, when a female nonsmoker may marry both a smoker and a nonsmoker with positive probability, the smoker has a higher SES than the nonsmoker. A remark coming directly from the optimal control formulation sheds some light on the form of the randomization. The Hamiltonian corresponding to the program (8) is linear in pN(x), the coefficient being equal to λS(x, ψN(x)) − S(x, ϕN(x)), which is always nonpositive. For large values of x, ψN(x) and ϕN(x) are both close to 1, and since λ < 1 the coefficient is negative, implying pN(x) = 0.
Randomization requires the coefficient to be zero; therefore, whenever randomization takes place, it must be the case that   \begin{equation*} S\left( x,\phi _{N}\left( x\right) \right) =\lambda S\left( x,\psi _{N}\left( x\right) \right) \end{equation*} (which, incidentally, implies ψN(x) > ϕN(x)). Since both ϕN(x) and ψN(x) depend on the probability pN(x) (by (4) and (5)), pN(x) must be such that this equality holds over the relevant domain. In addition, the intracouple allocation of the surplus can be recovered (see Appendix). Finally, we note that where exactly randomization occurs, and with which probability, depends on additional assumptions on the exact form of the surplus function and the two distributions. As an illustration, we provide an example in the Supplementary Material, where (i) smoking status and SES are independent, (ii) men are more likely to be smokers than women at all SES levels, and (iii) the surplus S is quadratic and symmetric. In that case, the program can be solved in closed form. There is a threshold below which nonsmoking women randomize their mates; moreover, the probability does not depend on the SES. More complex patterns can be constructed by varying the distributions of characteristics. 2.5.1. Theoretical Predictions In summary, the theoretical predictions of the constrained general case are as follows. Prediction 1. There are no “mixed” couples in which she smokes and he does not, whereas the opposite pattern (he smokes and she does not) happens with positive probability. Prediction 2. Smoking husbands married to smokers are of higher “quality” (i.e., higher SES) than those married to nonsmokers: smoking “premium” for smoking wives. Prediction 3. Nonsmoking wives married to smoking husbands have a lower SES than those married to nonsmokers: smoking “penalty” for smoking husbands. Prediction 4. There is positive assortative matching on SES among couples with identical smoking habits. Prediction 5.
Positive assortative matching on SES by smoking status is stronger at the top of the SES distribution: smoking husbands (wives) at the top of the SES distribution are more likely to marry smoking wives (husbands) than those with lower SES. Similarly, nonsmoking husbands (wives) at the top of the SES distribution are more likely to marry nonsmoking wives (husbands) than those with lower SES. Prediction 6. When a female nonsmoker marries both a smoker and a nonsmoker with positive probability, the smoking husband has a higher SES than the nonsmoking one. When a male smoker marries both a smoker and a nonsmoker with positive probability, both women have the same SES. 3. Data Description 3.1. CPS March and TUS data Our empirical application uses years of education as the measure of SES and smoking status for the binary attribute. We use data from the US Current Population Survey (CPS), specifically its annual March supplements and the Tobacco Use Supplements (TUS), for the years 1996–2003, which provide the largest samples of married couples for whom information on tobacco use is available. The standard demographic and education variables are extracted from the annual March CPS supplements, to which data on smoking status are merged from the TUS. The TUS are monthly CPS supplements available discontinuously over time and in different months. Specifically, the available TUS of interest are January and May 1996, 1999, 2000; June 2001; February 2002; and February and June 2003. The CPS is a series of monthly cross sections, with a short longitudinal component. Individuals in the sample are interviewed eight times–four times, followed by a break of eight months, and then interviewed for the same four months the following year. As such, it is possible to match observations of the same individuals across months, using the household and person identification codes, along with the month-in-sample information. 
However, several observations are dropped due to the specific design of the rotation samples by 4-month periods. In addition, we also check for age, gender, and race, to ascertain that the merged observations consistently belong to the same individual.11 The TUS-CPS is a National Cancer Institute sponsored survey of tobacco use and policy information that has been administered as part of the CPS since 1992. It is considered a key and reliable source of national, state, and substate level data on smoking and other tobacco use in the US households, which are widely used in medical research on cancer and other consequences of smoking (e.g., Delnevo and Bauer 2009; Mills et al. 2009). It provides data on a nationally representative sample of about 240,000 civilian, noninstitutionalized individuals aged 15 years and older. We are able to match individuals across months, merging the TUS supplements back to the March supplement of the corresponding year, to build a series of repeated cross-sections for the years 1996, 1999, 2000, 2001, 2002, and 2003. Due to the CPS rotation sample design described above, the sample size of each match is at most 1/4, 1/2 , or 3/4, of the original March sample size (when matched to June, January and May, or February, respectively). In general, the farther from March the TUS supplement month is, the fewer observations can be matched, with the strong restriction that the TUS months of September (1992, 1995, 1998), and November (2001 and 2003) cannot be merged back to March, as they do not share any respondent (Madrian and Lefgren 1999). Nevertheless, our sample is large, with detailed socioeconomic and smoking information on both spouses, and (to the best of our knowledge) it is the first time it is used to study marriage and smoking. We specifically extract husbands and wives from our merged CPS files. 
Married individual records of the reference person and her spouse are then matched on the household identification code (and household number) to create a single observation for each couple, keeping only observations of couples who lived in households with only one family. Our main sample of husbands and wives consists of white couples, where the wife is between 22 and 32 years old and the husband is between 24 and 34 years old. This demographic group allows us to focus on recently married couples. In the United States the median age at first marriage was around 27 for men and 25 for women for the period 1996–2007. On the other hand, lower bounds of 22 and 24 years old also allow us to include college graduates after they have completed their schooling. The additional two years in the husbands’ bounds are based on the standard median/mean age difference of two years between male and female spouses (Chiappori, Iyigun, and Weiss 2009). Note that the March CPS does not record the duration of marriage; in particular, the June Fertility Supplements, which used to provide the age at (first) marriage, no longer contain it in the most recent years covered by our study. In addition to individual age, we use the state of residence, year of interview, sample household weight and education of the individual. Since 1992, the CPS has recorded education as degrees attained rather than years of schooling completed; we thus assign the number of years of schooling to the corresponding degrees. We will also consider information on self-reported health status, number of children under age six and male-head indicator. March CPS household weights are used to make our sample of couples representative of the US population. From the Tobacco Use Supplement, we retrieve information on the smoking status of each individual.
Specifically, we focus on the respondents’ answers to whether and how often they smoke, whether they have smoked at least 100 cigarettes in their lifetime, and at which age they started smoking regularly. From the first two questions, we construct a dummy variable of smoking status, defining a person as a smoker if she reports smoking every day or some days and has smoked at least 100 cigarettes in her lifetime, and as a nonsmoker if she reports that she never smokes or has smoked fewer than 100 cigarettes in her lifetime (CDC definition, NCHS 2010). The age at which they started smoking regularly, as a predetermined explanatory variable of current smoking status with respect to marital status, will prove useful in dealing with the potential endogeneity of smoking with respect to marital status. Finally, note that each spouse directly reports his/her information, and self-reporting of smoking status is considered a reliable source of information, as it is found to be validated by measured serum cotinine levels (Caraballo et al. 2001). 3.2. Smoking Prevalence by Gender An interesting aspect of the smoking attribute is the asymmetry in the smoking prevalence across genders. In the United States, and actually in many countries, male smokers largely outnumber female smokers, a discrepancy that has remained stable over the last decades. This gender asymmetry has been emphasized by the Surgeon General (e.g., Surgeon General’s Report 2001), as well as by several studies in various fields (e.g., Gruber 2001 in economics and Öberg et al. 2011 in medicine). Although a negative smoking gradient is observed by education (e.g., Gruber 2001), the gender gap in smoking prevalence is maintained across all education levels.
For instance, in 2007, the prevalence of smokers by educational level among white men and women 25 years of age and over was as follows: 30.8% versus 23.9% for those with less than high-school; 29.9% versus 25.2% for those with high-school; 21.8% versus 19.6% for those with some college; and 10.5% versus 8.2% for those with college or above (NCHS 2010). We further investigate this gender smoking asymmetry by computing the number of smokers by gender and educational attainment (years of education) in our data. Figure 1 shows that the number of male smokers is always higher than the number of female smokers, for each educational attainment (years of education). Hence, the Assumption MP is satisfied in our data.

Figure 1. Number of smokers by educational level and gender. Married.

3.3. Descriptive Statistics Table 1 displays the summary statistics of the main variables of interest in our sample. Panel A reveals that women are slightly more educated and slightly less likely to be in excellent or very good health than their husbands, whereas their average number of children under six years old is about 0.8. Panel B reports the same statistics for never married individuals, showing again that men are healthier and less educated than women, and with a negligible number of young children. Finally, note the asymmetry in the smoking prevalence across genders: the smoking prevalence is 22% for husbands versus 17% for wives (29% and 26% for never-married men and women).

Table 1. Summary statistics. Means (standard deviations).

| A. Married | Husbands | Wives |
| --- | --- | --- |
| Age (years) | 29.48 (2.80) | 27.81 (2.78) |
| Education (years) | 13.65 (2.37) | 13.79 (2.30) |
| Smoke (=1 if every day/some day smoker and 100 cigarettes in their lifetime) | 0.22 (0.42) | 0.17 (0.38) |
| Very healthy (=1 if excellent or very good self-reported health) | 0.83 (0.38) | 0.80 (0.40) |
| Number of children under age 6 | 0.83 (0.85) | |
| Number of couples | 10,305 | |

| B. Never married | Men | Women |
| --- | --- | --- |
| Age (years) | 28.24 (3.12) | 25.91 (3.06) |
| Education (years) | 13.77 (2.28) | 13.99 (2.22) |
| Smoke (=1 if every day/some day smoker and 100 cigarettes in their lifetime) | 0.29 (0.45) | 0.26 (0.44) |
| Very healthy (=1 if excellent or very good self-reported health) | 0.79 (0.41) | 0.77 (0.42) |
| Number of children under age 6 | 0.04 (0.24) | 0.15 (0.44) |
| Number of individuals | 8,990 | 9,361 |

Notes: CPS 1996–2003, men aged 24–34, women aged 22–32. Sampling weights are used.

4. Empirical Analysis 4.1. From Theory to Data It is well known by now that matching models are difficult to test from the sole observation of matching patterns.
For instance, the Choo–Siow formulation, where the matching problem is analyzed as a series of discrete choice models, is exactly identified and therefore cannot be rejected by the data, even with its strong parametric assumption, that is, that unobserved heterogeneity follows a type 1 extreme value distribution. Here, we suggest that an alternative approach may be feasible. In more complex settings, including multidimensional contexts, matching theory can generate strong qualitative predictions. In some cases, including the present one, hypotheses based on these predictions can be tested in reduced form. Specifically, the theoretical analysis suggests that matching patterns should exhibit specific features due to the underlying competitive structure. These features could in principle be formally derived from an explicit, specific stochastic structure, which can be based on a Choo–Siow setting (following a line of research initiated by Graham 2011), but not in general; an alternative justification would involve a “rank order” property à la Fox (2010), or possibly search models. In any case, we argue that if Assumption MP is satisfied, then we expect the following regularities to hold.

Hypothesis 1 (Relative Prevalence of Mixed Couples). Mixed couples in which the wife smokes (denoted (0,1)) should be less frequent than those in which the husband smokes (denoted (1,0)).

Actually, Assumption MP by itself implies an asymmetry between the proportions of mixed couples: if there are more male than female smokers, then one expects more (1,0) couples. To take an easy benchmark, assume that the proportion of smokers in the male (resp. female) population is equal to $$s_{M}$$ (resp. $$s_{W}$$) for all SES levels. If matching were random with respect to smoking habits, the ratio of (0,1) to (1,0) couples would equal $$r_{\mathit{random}}$$, and Hypothesis 1 requires the observed ratio to fall below it:

\begin{equation*} r_{\mathit{observed}}<r_{\mathit{random}}=\frac{s_{W}\left( 1-s_{M}\right) }{s_{M}\left( 1-s_{W}\right) } \end{equation*}

In our total sample, $$s_{M}$$ is 0.22 and $$s_{W}$$ is 0.17, so the ratio $$r$$ implied by the sole difference in relative smoking prevalence is around 0.72; we expect the observed ratio to be significantly smaller than this threshold. Moreover, the same exercise can be performed for each “marriage market” (or region).

Hypothesis 2 (Smoking “Premium” for Smoking Wives). Among smoking husbands, those who marry smoking wives should have (on average) a higher SES than those who marry nonsmokers.

Hypothesis 3 (Smoking “Penalty” for Smoking Husbands). Among nonsmoking wives, those who marry smoking husbands should have (on average) a lower SES than those who marry nonsmokers.

Hypothesis 4 (Assortativeness by SES among Couples with Identical Smoking Habits). Among couples with identical smoking habits (i.e., both smokers or both nonsmokers), matching should be assortative on SES.

Hypothesis 5 (Stronger Assortativeness by Smoking Status at the Top of the SES Distribution). Smoking husbands (wives) at the top of the SES distribution are more likely to marry smoking wives (husbands) than those with lower SES. Similarly, nonsmoking husbands (wives) at the top of the SES distribution are more likely to marry nonsmoking wives (husbands) than those with lower SES.

Hypothesis 6 (Conditional versus Unconditional Correlation between Smoking and SES). When two nonsmoking women with the same SES marry, respectively, a smoker and a nonsmoker, the nonsmoker should be on average of lower SES than the smoker. That is, controlling for the (nonsmoking) wife’s SES, the smoking habit of the husband should be positively correlated with his SES.
This result seems counterintuitive, since a well-established empirical fact is that the (unconditional) correlation between smoking and education is negative: low SES is an important predictor of smoking behavior (e.g., CDC 2010). In the theoretical framework we use, however, (perfect) assortative matching on SES implies that, for given smoking habits, the husband’s SES is fully determined by his wife’s. In reality, agents match on several characteristics (many of them unobservable), so that the wife’s SES is not a perfect control for the husband’s, meaning that the (unconditional) negative correlation between smoking and SES is likely to persist. We therefore restate the prediction as follows: the conditional correlation between male SES and smoking status, given the SES of nonsmoking wives, should be less negative than the unconditional one. Note that this pattern should not hold true for women.

The remaining part of the paper is devoted to testing these hypotheses.

4.2. Testing the Hypotheses

4.2.1. Relative Prevalence of Mixed Couples

Table 2 reports both the observed (Panel A) and random (Panel B) matching patterns by smoking status for husbands and wives. Panel A reveals that there is strong assortative mating by smoking status: in about 71% of couples both spouses are nonsmokers, and in roughly 10% both are smokers. This is in line with previous evidence on marital sorting by smoking status (e.g., Clark and Etilé 2006). In addition, there are fewer mixed couples where the wife smokes than vice versa, 6.7% versus 11.93%. These figures are very different from the percentages arising from random matching that are reported in Panel B, where sorting is weaker (the percentage of couples on the main diagonal is now less than 70%) and the two types of mixed couples are much more prevalent and more similarly represented. Indeed, Panel C shows that the observed ratio of mixed couples where the wife smokes to those where the husband smokes is 0.56 (s.e. = 0.03), which is statistically significantly lower than the 0.72 (s.e. = 0.02) implied by the sole difference in relative smoking prevalence, consistent with our Hypothesis 1.¹² As a last test, in Panel D, we perform the same exercise at the level of the US Census geographical divisions (of which there are nine). The results are quite explicit: we find that the prediction is satisfied in each of these divisions. Under the null of random matching, the probability of such an outcome would be extremely small, that is, $$2^{-9}\cong 2\times 10^{-3}$$.

Table 2. Matching patterns by smoking status.

| A. Observed | Nonsmoking wife | Smoking wife |
| --- | --- | --- |
| Nonsmoking husband | 70.93% (7,390) | 6.70% (683) |
| Smoking husband | 11.93% (1,211) | 10.44% (1,021) |

| B. Random | Nonsmoking wife | Smoking wife |
| --- | --- | --- |
| Nonsmoking husband | 64.32% | 13.30% |
| Smoking husband | 18.53% | 3.83% |

| C. Prevalence ratios of mixed couples | Observed | Random |
| --- | --- | --- |
| Ratios | $$\frac{6.70}{11.93}=0.56$$ [0.03] | $$\frac{13.30}{18.53}=0.72$$ [0.02] |

| D. Prevalence ratios of mixed couples by Census Division | Observed | Random |
| --- | --- | --- |
| Census Division 1 (New England) | 0.57 [0.12] | 0.72 [0.08] |
| Census Division 2 (Middle Atlantic) | 0.63 [0.09] | 0.73 [0.07] |
| Census Division 3 (East North Central) | 0.67 [0.09] | 0.80 [0.06] |
| Census Division 4 (West North Central) | 0.62 [0.11] | 0.75 [0.08] |
| Census Division 5 (South Atlantic) | 0.43 [0.07] | 0.63 [0.05] |
| Census Division 6 (East South Central) | 0.58 [0.12] | 0.75 [0.08] |
| Census Division 7 (West South Central) | 0.48 [0.08] | 0.67 [0.06] |
| Census Division 8 (Mountain) | 0.55 [0.09] | 0.70 [0.07] |
| Census Division 9 (Pacific) | 0.56 [0.10] | 0.69 [0.08] |

Note: Sampling weights are used. Weighted % and (nonweighted number of observations). “Delta method” standard errors in brackets.

4.2.2. Smoking Premium and Smoking Penalty

We investigate Hypotheses 2 and 3 in Tables 3 and 4, respectively. They contain a series of regressions in which either the husband’s or the wife’s education is the dependent variable and the spouse’s education and smoking status are the main explanatory variables, for the samples of smoking husbands or smoking wives in Table 3, and of nonsmoking husbands or nonsmoking wives in Table 4. Two specifications are presented: a standard one, with controls for own age, year, and state fixed effects; and another one with additional controls, where we also include a spousal very healthy indicator, the number of children under age six, and a male-head indicator.

Table 3. Regressions of education on spousal smoking status for smokers.

| | (1) | (2) | (3) | (4) |
| --- | --- | --- | --- | --- |
| Sample | Smoking wives | Smoking wives | Smoking husbands | Smoking husbands |
| Dependent variable | Wife’s education | Wife’s education | Husband’s education | Husband’s education |
| Spouse is a smoker | −0.002 (0.092) | −0.006 (0.092) | 0.139* (0.080) | 0.151* (0.080) |
| Spouse’s education | 0.471*** (0.029) | 0.461*** (0.030) | 0.555*** (0.028) | 0.542*** (0.028) |
| Standard controls | Yes | Yes | Yes | Yes |
| Additional controls | No | Yes | No | Yes |
| N | 1,704 | 1,704 | 2,232 | 2,232 |
| R2 | 0.28 | 0.29 | 0.37 | 0.37 |

Note: Standard controls: own age, year, and state fixed effects. Additional controls: spousal very healthy indicator, number of children under age 6, and a male-head indicator. Sampling weights are used. Robust standard errors in parentheses. ***p-value < 0.01; *p-value < 0.1.

Table 4. Regressions of education on spousal smoking status for nonsmokers.

| | (1) | (2) | (3) | (4) |
| --- | --- | --- | --- | --- |
| Sample | Nonsmoking wives | Nonsmoking wives | Nonsmoking husbands | Nonsmoking husbands |
| Dependent variable | Wife’s education | Wife’s education | Husband’s education | Husband’s education |
| Spouse is a smoker | −0.133** (0.063) | −0.108* (0.063) | −0.235*** (0.080) | −0.208*** (0.080) |
| Spouse’s education | 0.636*** (0.013) | 0.608*** (0.013) | 0.684*** (0.014) | 0.669*** (0.015) |
| Standard controls | Yes | Yes | Yes | Yes |
| Additional controls | No | Yes | No | Yes |
| N | 8,601 | 8,601 | 8,073 | 8,073 |
| R2 | 0.48 | 0.49 | 0.46 | 0.47 |

Note: Standard controls: own age, year, and state fixed effects. Additional controls: spousal very healthy indicator, number of children under age 6, and a male-head indicator. Sampling weights are used. Robust standard errors in parentheses. ***p-value < 0.01; **p-value < 0.05; *p-value < 0.1.

Starting from the left block of regressions in Table 3, columns (1) and (2) show that among smoking wives there is no statistically significant difference in the average years of completed education between those who marry smoking men and those who marry nonsmoking ones, and that the point estimates are virtually zero. The action is concentrated in columns (3) and (4): among smoking husbands, those who marry smoking women have on average 0.15 more years of completed education (or about 1.2% higher annual earnings¹³) than those who marry nonsmoking ones. This supports our Hypothesis 2.
Our theoretical analysis shows that, given the shortage of smoking women, smoking men who marry smoking women should be more educated, whereas no such effect should be observed for women, which is what we observe in the “placebo” columns (1) and (2). Smoking men have to compete for smoking partners with their own education. In other words, smoking women “marry up” with respect to their own education; they “benefit” from the fact that they are on the short side of the market by marrying a higher “quality” spouse. Note that the sign of the wife’s smoker coefficient in columns (3) and (4) is opposite to the well-known negative gradient between own smoking and own education.

The evidence presented in Table 3 is also supportive of our assumptions on the surplus reduction due to smoking. If, for instance, smokers also preferred nonsmokers, then we would not observe the positive coefficient of columns (3) and (4). By the same token, our empirical evidence also shows that a gender-asymmetric perception of smoking cannot be the driving force behind the observed matching patterns. If men, regardless of their smoking status, perceived smoking in a woman as a defect, we would not observe the positive coefficient of columns (3) and (4) either.

We now turn to Hypothesis 3. Table 4 examines the average education of nonsmoking wives and nonsmoking husbands depending on the smoking status of their spouses. Consistent with Hypothesis 3, columns (1) and (2) show that among nonsmoking wives, those with smoking husbands have on average 0.11–0.13 fewer years of completed education (or about 0.9% lower annual earnings) than those with nonsmoking husbands. In other words, a smoking husband marries, on average, a less educated nonsmoking spouse than he would if he were a nonsmoker: smoking husbands, who are on the long side of the market, “marry down”, that is, they are penalized for their “handicap” with a lower “quality” spouse. This suggests that spousal smoking is a bad characteristic for nonsmokers, and that there is a marriage market “penalty” associated with it, in terms of lower socioeconomic standards. Finally, columns (3) and (4) show that among nonsmoking husbands, those with nonsmoking wives have on average 0.21–0.24 more years of completed education than those with smoking wives. Thus, a nonsmoking wife marries, on average, a better nonsmoking spouse in terms of education.

4.2.3. Smoking Premium and Smoking Penalty Controlling for the Propensity Score

Overall, the estimates reported in Tables 3 and 4 are consistent with our model, and support Hypotheses 2 and 3. A potential concern is that smoking status captures some characteristics that we do not observe in our data (e.g., personality traits) that would explain the above empirical findings. Although smoking is correlated with self-control and impatience, recent empirical evidence shows that some noncognitive skills and personality traits exhibit negative sorting in the marriage market, opposite to the strong positive sorting observed for smoking status. Dupuy and Galichon (2014) and Lundberg (2012) find that there is negative sorting in autonomy, that male extraversion is negatively correlated with female conscientiousness and risk attitudes, and that male agreeableness is negatively correlated with female extraversion.¹⁴

Another concern is whether such empirical patterns (even when controlling for additional covariates) are the mere reflection of smoking behavior being endogenous to marital status, rather than a confirmation of the postulated marriage market mechanism. We explore such a possibility by using a simple and popular estimator in program evaluation (see Wooldridge 2002) obtained from an OLS regression of education on spousal smoking status controlling for the predicted probability that the spouse is a smoker (the propensity score).
The spousal propensity score is the probability that the spouse is a smoker given a set of observable characteristics. For this approach to be credible, at least one of the predictors must be predetermined with respect to marital status, and the age at which the spouse started smoking regularly seems a natural candidate. Hence, we predict the probability that the spouse j is a smoker ($$\mathit{Smoking}_{j}=1$$) given a set of spousal characteristics ($$\boldsymbol{X}_{j}$$) using a probability model (Probit),

$$\mathit{Smoking}_{j}=\Phi (\boldsymbol {X}_{j}\boldsymbol {\beta })+\varepsilon _{j},$$ (9)

where $$\Phi$$ is the standard normal cdf and $$\boldsymbol{X}_{j}$$ contains the following spousal characteristics: age (in categories), years of education, year fixed effects, state fixed effects, and age at which the spouse started smoking regularly. We code this last variable as 0 for never smokers or never regular smokers. For this reason, and because people who start using tobacco regularly when they are younger are more likely to have trouble quitting than people who start later in life (Surgeon General’s Report 2012), we expect a nonlinear relationship between smoking status today and the age at which the individual started smoking regularly, so we also consider the square of this variable as an additional predictor.
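To make the functional form of equation (9) concrete, the following minimal sketch shows how a fitted Probit index could be turned into a spousal propensity score, with the onset age coded as 0 for never (regular) smokers and entered together with its square. It is an illustration under stated assumptions, not the estimation code behind the paper: the function names are hypothetical, the slopes merely mirror column (2) of Table A.1, the intercept is a made-up placeholder, and the age, year, and state fixed effects are omitted.

```python
import math


def std_normal_cdf(z):
    """Standard normal cdf, the Phi of equation (9)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def onset_age(age_started):
    """Age at which regular smoking began; 0 for never (regular) smokers."""
    return 0.0 if age_started is None else float(age_started)


def propensity_score(education, age_started, beta):
    """Predicted probability of smoking, Phi(X'beta), for one spouse."""
    a = onset_age(age_started)
    index = (beta["const"]
             + beta["education"] * education
             + beta["onset"] * a
             + beta["onset_sq"] * a * a)
    return std_normal_cdf(index)


# Hypothetical coefficients: the intercept is a placeholder and the fixed
# effects are left out, so the resulting scores are purely illustrative.
BETA = {"const": -0.5, "education": -0.108, "onset": 0.275, "onset_sq": -0.006}

never = propensity_score(14, None, BETA)       # never a regular smoker
started_at_16 = propensity_score(14, 16, BETA)  # began smoking at age 16
print(f"never smoker: {never:.3f}, started at 16: {started_at_16:.3f}")
```

Under these placeholder values, a spouse who never smoked regularly receives a much lower score than an otherwise identical spouse who started at 16, which is the kind of variation the propensity score is meant to capture.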
Once we have estimated the propensity scores $$\widehat{\mathit{Smoking}}_{j}$$ for men and women (see Table A.1), we then estimate the following OLS regressions

$${\mathit{Education}}_{i}=\alpha +\delta {\mathit{Smoking}}_{j}+\gamma \widehat{\mathit {Smoking}}_{j}+u_{i}$$ (10)

for four groups of individuals (smoking wives, smoking husbands, nonsmoking wives, and nonsmoking husbands), but also

$$\mathit{Education}_{i}=\alpha +\delta \mathit{Smoking}_{j}+\gamma _{1}\widehat{ \mathit{Smoking}}_{j} +\gamma _{2}\widehat{ \mathit{Smoking}}_{j}^{2}+u_{i}$$ (11)

to account for potential nonlinearities of the conditional expectation function of education in the propensity score. The estimated propensity score plays the role of a control function.

Reassuringly, the estimates displayed in Tables 5 and 6 give a similar picture to our main estimates in Tables 3 and 4, and are indeed consistent with existing research. Using information on engaged couples, newlyweds, and couples married for over 5 years, Sutton (1993) finds that similarities in smoking status were already present about the time of marriage. More recently, Banks et al. (2013), who use retrospective evidence from the Health and Retirement Study, show that most smoking behavior is initiated before marriage, so that smoking is not endogenous to marital status. Controlling for own age in equations (10) and (11) is immaterial for our findings (results available upon request).

Table 5. Regressions of education on spousal smoking status for smokers.
Controlling for Spousal Propensity to Smoke $$\widehat{P}(\textit{Spouse's characteristics})$$.

| | (1) | (2) | (3) | (4) |
| --- | --- | --- | --- | --- |
| Sample | Smoking wives | Smoking wives | Smoking husbands | Smoking husbands |
| Dependent variable | Wife’s education | Wife’s education | Husband’s education | Husband’s education |
| Spouse is a smoker | −0.002 (0.150) | 0.004 (0.164) | 0.247* (0.130) | 0.243* (0.138) |
| $$\widehat{P}(\textit{Spouse's characteristics})$$ | −0.581** (0.243) | 2.77*** (0.797) | −0.734*** (0.207) | 3.17*** (0.707) |
| $$\widehat{P}(\textit{Spouse's characteristics})^{2}$$ | – | −4.86*** (0.962) | – | −5.87*** (0.897) |
| N | 1,656 | 1,656 | 2,204 | 2,204 |
| R2 | 0.01 | 0.03 | 0.01 | 0.03 |

Note: $$\widehat{P}(\textit{Spouse's characteristics})$$ is the predicted probability of a spouse smoking status using a Probit. Spouse’s characteristics: age (in categories), years of education, age when he/she started smoking, and its square (in columns (2) and (4)), year, and state fixed effects. Sampling weights are used. Robust standard errors in parentheses. ***p-value < 0.01; **p-value < 0.05; *p-value < 0.1.

Table 6. Regressions of education on spousal smoking status for nonsmokers. Controlling for Spousal Propensity to Smoke $$\widehat{P}(\textit{Spouse's characteristics})$$.

| | (1) | (2) | (3) | (4) |
| --- | --- | --- | --- | --- |
| Sample | Nonsmoking wives | Nonsmoking wives | Nonsmoking husbands | Nonsmoking husbands |
| Dependent variable | Wife’s education | Wife’s education | Husband’s education | Husband’s education |
| Spouse is a smoker | −0.228** (0.101) | −0.237** (0.096) | −0.462*** (0.112) | −0.446*** (0.111) |
| $$\widehat{P}(\textit{Spouse's characteristics})$$ | −1.42*** (0.132) | 2.46*** (0.565) | −1.00*** (0.145) | 0.709 (0.539) |
| $$\widehat{P}(\textit{Spouse's characteristics})^{2}$$ | – | −5.67*** (0.851) | – | −2.70*** (0.835) |
| N | 8,529 | 8,529 | 8,038 | 8,038 |
| R2 | 0.04 | 0.04 | 0.02 | 0.02 |

Note: $$\widehat{P}(\textit{Spouse's characteristics})$$ is the predicted probability of a spouse smoking status using a Probit. Spouse’s characteristics: age (in categories), years of education, age when he/she started smoking, and its square (in columns (2) and (4)), year, and state fixed effects. Sampling weights are used. Robust standard errors in parentheses. ***p-value < 0.01; **p-value < 0.05.

4.2.4. Assortativeness by SES among Couples with Identical Smoking Habits

In Table 7 we investigate the degree of assortativeness by education among different types of couples depending on spouses’ smoking status. For each type of couple, we regress own education on spouse’s education controlling for own age, year, and state fixed effects. Our estimates reveal positive assortative matching by education for each type of couple. Although assortative mating by education has been extensively documented in the literature (Lam 1988; Mare 2008; Pencavel 1998; Qian 1998), here we show that it holds true within each spouses’ smoking category. This is consistent with Hypothesis 4.

Table 7. Assortative matching by education. Regressions of education on spousal education by type of couple.

| Couple type | Both nonsmokers | Both nonsmokers | Both smokers | Both smokers |
| --- | --- | --- | --- | --- |
| Dependent variable | Wife’s education | Husband’s education | Wife’s education | Husband’s education |
| Spouse’s education | 0.641*** (0.014) | 0.695*** (0.015) | 0.464*** (0.037) | 0.440*** (0.035) |
| N | 7,390 | 7,390 | 1,021 | 1,021 |
| R2 | 0.47 | 0.47 | 0.27 | 0.28 |

| Couple type | Smoking husband, nonsmoking wife | Smoking husband, nonsmoking wife | Nonsmoking husband, smoking wife | Nonsmoking husband, smoking wife |
| --- | --- | --- | --- | --- |
| Dependent variable | Wife’s education | Husband’s education | Wife’s education | Husband’s education |
| Spouse’s education | 0.604*** (0.034) | 0.619*** (0.040) | 0.482*** (0.047) | 0.488*** (0.044) |
| N | 1,211 | 1,211 | 683 | 683 |
| R2 | 0.44 | 0.44 | 0.33 | 0.31 |

Notes: All regressions include own age, year, and state fixed effects. Sampling weights are used. Robust standard errors in parentheses. ***p-value < 0.01.

4.2.5. Stronger Assortativeness by Smoking Status at the Top of the SES Distribution

In Table 8 we investigate how the degree of assortativeness by smoking status varies across the SES distribution. We run regressions of a spousal smoking status indicator on individual smoking status, individual SES, and the interaction of these two variables. We measure SES in two different ways: years of schooling and high-school degree or above (12 years of schooling or more). The results of these regressions suggest that, if anything, “mixed” couples (i.e., where one spouse is a smoker and the other is not) are less prevalent at high SES, supporting Hypothesis 5.

Table 8. Assortativeness in smoking status by SES.
Regressions of spousal smoking status.

| | (1) | (2) | (3) | (4) |
| --- | --- | --- | --- | --- |
| Dependent variable | Wife is a smoker | Wife is a smoker | Husband is a smoker | Husband is a smoker |
| Education | −0.014*** (0.001) | – | −0.022*** (0.002) | – |
| Smoker | 0.239*** (0.074) | 0.311*** (0.032) | 0.389*** (0.103) | 0.375*** (0.045) |
| Education × smoker | 0.009 (0.006) | – | 0.004 (0.008) | – |
| High school (or more) | – | −0.034** (0.016) | – | −0.105*** (0.020) |
| High school (or more) × smoker | – | 0.073** (0.035) | – | 0.092* (0.047) |
| N | 10,305 | 10,305 | 10,305 | 10,305 |
| R2 | 0.20 | 0.19 | 0.20 | 0.19 |

Note: All regressions include spouse’s age, year, and state fixed effects. Sampling weights are used. Robust standard errors in parentheses. ***p-value < 0.01; **p-value < 0.05; *p-value < 0.1.

4.2.6. Conditional versus Unconditional Correlations between Smoking and SES

Finally, in Table 9 we run regressions of education on smoking status for two samples, nonsmoking women and smoking men, controlling or not for spousal SES. Our findings support Hypothesis 6. Among men married to nonsmoking women, smoking men tend to have (on average) a lower SES (1.3 fewer years of education) but this difference decreases (to 0.7 years less of education) when controlling for their wives’ SES. Among women married to smoking men, smoking women tend to have (on average) a lower SES (around 0.3 fewer years of education) regardless of their husbands’ SES.

Table 9. Conditional and unconditional correlations. Regressions of education on smoking status.

| | Unconditional | Conditional | Unconditional | Conditional |
| --- | --- | --- | --- | --- |
| Sample | Nonsmoking women | Nonsmoking women | Smoking men | Smoking men |
| Dependent variable | Husband’s education | Husband’s education | Wife’s education | Wife’s education |
| Smoker | −1.34*** (0.076) | −0.657*** (0.063) | −0.339*** (0.094) | −0.314*** (0.080) |
| Spouse’s education | – | 0.683*** (0.014) | – | 0.559*** (0.025) |
| Adjusted Wald test of equality (p-value) | 0.0000 | | 0.6291 | |
| N | 8,601 | 8,601 | 2,232 | 2,232 |

Note: All regressions include own age, year, and state fixed effects. Sampling weights are used. Robust standard errors in parentheses. ***p-value < 0.01.

Table A.1. Propensity to smoke.
Probit models of smoking status on explanatory variables.

| | (1) | (2) | (3) | (4) |
| --- | --- | --- | --- | --- |
| Sample | Wives | Wives | Husbands | Husbands |
| Age individual started smoking regularly | 0.146*** (0.003) | 0.275*** (0.011) | 0.156*** (0.003) | 0.290*** (0.013) |
| Age individual started smoking regularly, squared | – | −0.006*** (0.0005) | – | −0.007*** (0.0006) |
| Education | −0.126*** (0.010) | −0.108*** (0.011) | −0.138*** (0.010) | −0.127*** (0.012) |
| Age fixed effects | Yes | Yes | Yes | Yes |
| Year fixed effects | Yes | Yes | Yes | Yes |
| State fixed effects | Yes | Yes | Yes | Yes |
| N | 10,185 | 10,185 | 10,242 | 10,242 |
| Pseudo-R2 | 0.51 | 0.53 | 0.54 | 0.55 |

Note: Sampling weights are used. Robust standard errors in parentheses. ***p-value < 0.01.

4.3. Refutability Tests

We implement two types of refutability tests (see Angrist and Krueger 2001). We first provide placebo tests based on the idea that the predictions of our model should not be borne out in the data when married men and women are randomly assigned to couples. In our second approach, we test the proportionality constraints implied by an alternative single-index model.

4.3.1. Placebo Tests

We reshuffle our observed married individuals into new randomly married couples. We do this in two stages.
We first keep married men in our sample and generate a new id drawn from a uniform distribution for each of these men, and then rank them according to that pseudorandom id. We do the same for married women: we generate a new id drawn from a uniform distribution for each of these women, and then rank them according to that pseudorandom id. We then merge men and women according to that pseudorandom id. This constitutes our random matching sample. After that, we investigate whether the hypotheses based on the predictions of our model are observed in these randomly generated data. The answer is clearly negative. Table G.1 in the Supplementary Material shows that with randomly generated couples, it is not true that among smoking husbands those who marry smoking wives have (on average) a higher SES. Table G.2 in the Supplementary Material shows that with randomly generated couples, it is not true that among nonsmoking wives, those who marry smoking husbands have (on average) a lower SES. Table G.3 in the Supplementary Material shows that with randomly generated couples, it is not true that “mixed” couples are less prevalent at high levels of SES. Finally, Table G.4 in the Supplementary Material reveals that with randomly generated couples, Hypothesis 6 is not borne out in the data. All in all, it seems that our model cannot explain the marriage patterns of randomly generated couples. 4.3.2. Alternative Single-Index Model Our bidimensional model of smoking and SES rules out a single index representation: the trade-off between these two characteristics is perceived differently among potential spouses, which is precisely what index models forbid. Chiappori et al. (2012) have shown that index models are testable; they must satisfy a set of proportionality restrictions. 
In Table G.5 in the Supplementary Material, we show that these restrictions are rejected by the data when the two dimensions are smoking status and SES. This provides additional support for the contention that the single-index representation, albeit very appealing, is not always appropriate and, in particular, does not apply to our case.

5. Discussion

Our model parsimoniously assumes that the presence of a smoking spouse decreases surplus by the same factor irrespective of the partner’s smoking status. This assumption could be relaxed; one could assume, for instance, that the surplus discount is larger for mixed couples. The corresponding, more general model would still be solvable using the same technique. Most general conclusions would remain valid. In particular, there would be assortative matching at the top of the distribution; if male smokers outnumbered female ones, then mixed couples would consist of a male smoker and a nonsmoking wife; and in the latter case the wife would “marry up”.

However, other predictions would be lost. For instance, the last statement of Proposition 5 would no longer hold: when a smoking man randomizes between two potential spouses, they will not have the same socioeconomic index (although, interestingly, they do in the data; see Table 6). In addition, the closed-form solution of the uniform-quadratic example provided in the Supplementary Material would no longer be valid; its resolution, although still feasible, would become more complex and tedious, since it would require considering more cases. All in all, the new and less tractable model would lose empirical content (although the predictions that would be discarded are supported by the data), with no significant gain in terms of its basic insights. We therefore believe that parsimony considerations suggest concentrating on the single λ case.

6. Conclusions

We develop a bidimensional frictionless matching model on the marriage market under transferable utility, where individuals are characterized by a continuous trait (e.g., socioeconomic status) and a binary attribute (e.g., smoking status) that is heterogeneously (dis)liked in the population. As long as traits are not perfectly correlated, this heterogeneity is key: it rules out a single-index representation. The trade-off between the two characteristics is perceived differently among potential spouses, which is precisely what index models forbid. That is, the corresponding one-dimensional matching techniques are not well-suited to characterize the stable matches in this setting, and this paper characterizes a specific extension that allows for heterogeneous preferences.

The main message of our paper is twofold. First, specific multidimensional models of matching, although intrinsically more complex than one-dimensional ones, are by no means intractable. We describe a general strategy for tackling problems of this type, and in particular how the characterization of the equilibrium can be formulated as an optimal control problem; we also show how such a theoretical approach can generate strong predictions on matching patterns. Although we concentrate on smoking in our empirical application, other aspects could readily be considered. Race and ethnicity are important examples (e.g., Chiappori, Oreffice, and Quintana-Domeque 2016), as are characteristics such as age or health attributes.

Second, one-dimensional models are difficult to test, at least in the absence of additional information on either match-specific surpluses or transfers between agents. In more complex settings, however, and particularly in multidimensional contexts, the strong qualitative predictions that matching theory generates can be taken to the data. Hypotheses based on these predictions can be tested in reduced form.
In our case, the model predicts a series of specific patterns (from the relative scarcity of couples where the wife smokes and the husband does not, to the opposite association between having a smoking spouse and one’s own education for wives and husbands) that are difficult to justify otherwise and can be directly tested in a simple and robust way. We do not view such a strategy as a substitute for more explicitly structural approaches, but we believe it provides an informative complement.

Notes

The editor in charge of this paper was Juuso Välimäki.

Acknowledgements

We thank Juuso Välimäki (the editor) and two anonymous reviewers for helpful comments and suggestions. We also thank the participants at the Family Economics Workshop RHUL 2014, “The Econometrics of Matching” invited session of the EEA 2013, SaM Mainz 2013, Paris Matching Workshop 2013, Royal Economic Society Meetings 2012, Barcelona MOVE Family Conference 2011, and seminars at Ecole Polytechnique, CEPS/INSTEAD, Lancaster University, LSE, University of Oxford, University of Chicago, Columbia University, University of Aarhus, Universitat d’Alacant, Universitat Pompeu Fabra and Universidad Carlos III de Madrid. Chiappori acknowledges financial support from the NSF (award # SES-1124277). Oreffice and Quintana-Domeque acknowledge financial support from the Spanish Ministry of Science and Innovation (ECO 2011-29751). The usual disclaimers apply.

Appendix A: Proof of Proposition 3

Take some small ε > 0 such that

\begin{equation*} S\left( 1-\varepsilon ,1-\varepsilon \right) >\lambda S\left( 1,1\right). \end{equation*}

Define η(ε) > 0 by

\begin{equation*} \int _{1-\varepsilon }^{1}dF_{N}\left( s\right) =\int _{1-\eta \left( \varepsilon \right) }^{1}dG_{N}\left( s\right) , \tag{A.1} \end{equation*}

so that there are exactly as many nonsmoking men with SES above 1 − η(ε) as nonsmoking women with SES above 1 − ε.

We claim that almost all female nonsmokers with SES at least 1 − ε are married to a male nonsmoker with SES at least 1 − η(ε) (note that (A.1) then implies that, conversely, almost all male nonsmokers with SES at least 1 − η(ε) are married to a female nonsmoker with SES at least 1 − ε). Assume not; then there exists a positive measure set O of female nonsmokers with SES at least 1 − ε married to a smoker. By (A.1), there must exist a set O′ of identical measure gathering male nonsmokers with SES at least 1 − η(ε) who are not married to female nonsmokers with SES at least 1 − ε. Then either almost all males in O′ are married to nonsmokers with SES less than 1 − ε, or a non-null subset of males in O′ is matched with smokers.

We start with the second case. Let x ∈ O, y her (smoking) match, and y′ ∈ O′ matched with a smoker x′. The surplus is

\begin{equation*} \Sigma =\lambda S\left( x,y\right) +\lambda S\left( x^{\prime },y^{\prime }\right), \end{equation*}

whereas matching x and y′ would generate a surplus

\begin{equation*} \Sigma _{1}=S\left( x,y^{\prime }\right) +\lambda S\left( x^{\prime },y\right). \end{equation*}

By definition of ε, Σ1 > Σ, a contradiction.

Assume now that almost all males in O′ are married to nonsmokers with SES less than 1 − ε. Let x ∈ O be matched with yS, a smoker, whereas y′ ∈ O′ is matched with a nonsmoking wife x′ < 1 − ε. The surplus generated is thus

\begin{equation*} \Sigma =S\left( x^{\prime },y^{\prime }\right) +\lambda S\left( x,y_{S} \right), \end{equation*}

whereas mixing matches would generate

\begin{equation*} \Sigma _{1}=S\left( x,y^{\prime }\right) +\lambda S\left( x^{\prime } ,y_{S}\right). \end{equation*}

Note that yS > 1 − η(ε), for otherwise

\begin{eqnarray*} \Sigma _{1}-\Sigma & =S\left( x,y^{\prime }\right) -S\left( x^{\prime },y^{\prime }\right) -\lambda \left( S\left( x,y_{S}\right) -S\left( x^{\prime },y_{S}\right) \right) \\ & >S\left( x,y^{\prime }\right) -S\left( x^{\prime },y^{\prime }\right) -\left( S\left( x,y_{S}\right) -S\left( x^{\prime },y_{S}\right) \right) >0 \end{eqnarray*}

by supermodularity, which contradicts surplus maximization. Define

\begin{equation*} \phi \left( s\right) =S\left( x,s\right) -S\left( x^{\prime },s\right); \end{equation*}

then ϕ is differentiable and strictly positive on [0, 1]. We have that

\begin{equation*} \left|\phi \left( y^{\prime }\right) -\phi \left( y_{S}\right) \right|\le \left|y^{\prime }-y_{S}\right|M, \end{equation*}

where $$M=\sup _{\left[ 0,1\right] }\left|\phi ^{\prime }\right|$$, and where |y′ − yS| ≤ η(ε). It follows that

\begin{equation*} \phi \left( y_{S}\right) \le \phi \left( y^{\prime }\right) +\eta \left( \varepsilon \right) M, \end{equation*}

therefore

\begin{equation*} \Sigma _{1}-\Sigma =\phi \left( y^{\prime }\right) -\lambda \phi \left( y_{S}\right) \ge \left( 1-\lambda \right) \phi \left( y^{\prime }\right) -\lambda \eta \left( \varepsilon \right) M, \end{equation*}

which is positive for ε small enough, a contradiction.

Appendix B: Proof of Proposition 4

The proof relies on the following Lemma. We can now prove the proposition. By Lemma 1, y′ > y and x = x′. Assume that qN(y) > 0, that is, that y marries a smoker x″ with positive probability. Then x″ > x = x′ by Lemma 1.
But the couples (x″, y) and (x′, y′) generate a surplus

\begin{equation*} \Sigma =\lambda S\left( x^{\prime \prime },y\right) +\lambda S\left( x^{\prime },y^{\prime }\right), \end{equation*}

whereas the mixed couples (x′, y) and (x″, y′) would generate a surplus

\begin{equation*} \Sigma _{1}=\lambda S\left( x^{\prime },y\right) +\lambda S\left( x^{\prime \prime },y^{\prime }\right) \end{equation*}

and Σ1 > Σ by supermodularity of S; an open set of marriages satisfying this pattern would violate surplus maximization.

Similarly, assume that pS(x′) < 1, that is, that x′ marries a nonsmoker $$\bar{y}$$ with positive probability. Then $$\bar{y}=y^{\prime }>y$$ by Lemma 1. The couples (x, y) and $$\left( x^{\prime },\bar{y}\right)$$ generate a surplus

\begin{equation*} \Sigma =S\left( x,y\right) +\lambda S\left( x^{\prime },\bar{y}\right), \end{equation*}

whereas the mixed couples $$\left( x,\bar{y}\right)$$ and (x′, y) would generate a surplus

\begin{equation*} \Sigma _{1}=S\left( x,\bar{y}\right) +\lambda S\left( x^{\prime },y\right). \end{equation*}

Since

\begin{equation*} S\left( x,\bar{y}\right) -S\left( x,y\right) =S\left( x^{\prime },\bar{y}\right) -S\left( x^{\prime },y\right) >\lambda \left( S\left( x^{\prime },\bar{y}\right) -S\left( x^{\prime },y\right) \right), \end{equation*}

we have that Σ1 > Σ; again, an open set of marriages satisfying this pattern would violate surplus maximization. The proof of the last statement is identical.

Appendix C: Proof of Proposition 5

The first statement is a direct consequence of Proposition 3. Similarly, the last two statements directly follow from Lemma 1 above. To establish the second, we need to show that any mixed marriage can only involve a nonsmoking woman and a male smoker. Under the Male Prevalence assumption, this directly follows from the proof of Proposition 4.

Appendix D: Recovering Individual Utilities in the Stable Match

D.1. The Symmetric Case

Assume Ms. x (a nonsmoker) marries Mr. y (also a nonsmoker) at the stable match; note that x = y by the previous Proposition. Let uN(x) (resp. vN(y)) denote her (his) utility. Then

\begin{equation*} u_{N}\left( x\right) +v_{N}\left( x\right) =S\left( x,x\right). \end{equation*}

By symmetry,

\begin{eqnarray*} u_{N}\left( x\right) & =&\frac{S\left( x,x\right) }{2},v_{N}\left( y\right) =\frac{S\left( y,y\right) }{2} \quad {\rm and\,\, similarly}\\ u_{S}\left( x\right) & =&\lambda \frac{S\left( x,x\right) }{2} ,v_{S}\left( y\right) =\lambda \frac{S\left( y,y\right) }{2}. \end{eqnarray*}

D.2. The General Case

Consider for instance a nonsmoking wife with SES x. If her husband is a nonsmoker with SES ϕN(x), then stability implies that

\begin{equation*} u_{N}\left( x\right) =\max _{y}\left( S\left( x,y\right) -v_{N}\left( y\right) \right), \end{equation*}

the maximum being reached for y = ϕN(x). It follows, from the envelope theorem, that

\begin{equation*} u_{N}^{\prime }\left( x\right) =\frac{\partial S}{\partial x}, \end{equation*}

where the right-hand side derivative is taken at the point (x, ϕN(x)). Then,

\begin{equation*} u_{N}\left( x\right) =\int _{0}^{x}\frac{\partial }{\partial x}S\left( t,\phi _{N}\left( t\right) \right) dt+K, \end{equation*}

where K is a constant; a similar expression obtains for the other utilities. The various utilities are therefore defined up to an additive constant each; the constants, in turn, are pinned down by the adding-up property on the support of μ and the indifference conditions. In particular, it becomes possible to compute the difference uN(x) − uS(x), which can be interpreted as the cost of smoking on the marriage market (or, equivalently, as the gain that would result from quitting) for a woman with SES x.

D.3. The Constrained General Case

We now compute individual utilities at the stable matching, as a function of SES, gender and smoking habit. Specifically,

Proposition D.1. Under the MP assumption, individual utilities uN(x), uS(x), vN(y), vS(y) are all increasing. For all x and all y,

\begin{equation*} u_{N}\left( x\right) \ge u_{S}\left( x\right) \quad \text{and}\quad v_{N}\left( y\right) \ge v_{S}\left( y\right). \end{equation*}

Moreover, whenever pN(x) > 0,

\begin{equation*} u_{N}\left( x\right) =u_{S}\left( x\right). \end{equation*}

Lastly, for x and y large enough, the differences uN(x) − uS(x) and vN(y) − vS(y) are increasing (in x and y, respectively).

Proof. Assume Ms. x marries either Mr. y (a nonsmoker) or Mr. y′. The first inequality comes from the fact that, for any given SES, a nonsmoker can only be either an equivalent or a better partner than a smoker. Next, if pN(x) > 0, both a smoking and a nonsmoking wife marry the same smoking husband with positive probability. Since the total surplus is the same in both cases, their utility must be the same. Lastly, first-order conditions give that

\begin{eqnarray*} u_{N}^{\prime }\left( x\right) &=&\frac{\partial }{\partial x}S\left( x,\phi _{N}\left( x\right) \right) >0,\\ u_{S}^{\prime }\left( x\right) &=&\lambda \frac{\partial }{\partial x}S\left( x,\psi _{S}\left( x\right) \right) >0, \end{eqnarray*}

so that

\begin{equation*} \left( u_{N}\left( x\right) -u_{S}\left( x\right) \right) ^{\prime }=\frac{\partial }{\partial x}S\left( x,\phi _{N}\left( x\right) \right) -\lambda \frac{\partial }{\partial x}S\left( x,\psi _{S}\left( x\right) \right). \end{equation*}

For x large enough, both ϕN(x) and ψS(x) are close to 1, and that difference is positive.

In short, utility increases with the person’s SES, and smokers are never better off than nonsmokers. However, whenever randomization takes place, a woman’s welfare does not depend on her smoking status: both smokers and nonsmokers marry a smoking husband with positive probability, with whom they generate the same surplus. For men, on the other hand, welfare is always smaller for smokers.
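As a consistency check on D.1 and D.2, consider the symmetric case with the purely illustrative surplus S(x, y) = xy. This functional form is an assumption made here for concreteness (it is supermodular, but it is not necessarily the uniform-quadratic specification of the Supplementary Material). The equal split of D.1 gives

\begin{equation*} u_{N}\left( x\right) =\frac{S\left( x,x\right) }{2}=\frac{x^{2}}{2},\qquad u_{S}\left( x\right) =\lambda \frac{S\left( x,x\right) }{2}=\lambda \frac{x^{2}}{2}, \end{equation*}

while the envelope condition of D.2, evaluated at ϕN(x) = x, gives

\begin{equation*} u_{N}^{\prime }\left( x\right) =\frac{\partial S}{\partial x}\left( x,\phi _{N}\left( x\right) \right) =\phi _{N}\left( x\right) =x\quad \Longrightarrow \quad u_{N}\left( x\right) =\int _{0}^{x}t\,dt+K=\frac{x^{2}}{2}+K, \end{equation*}

so the two expressions agree with K = 0. Under this assumed surplus, the cost of smoking is

\begin{equation*} u_{N}\left( x\right) -u_{S}\left( x\right) =\left( 1-\lambda \right) \frac{x^{2}}{2},\qquad \left( u_{N}\left( x\right) -u_{S}\left( x\right) \right) ^{\prime }=\left( 1-\lambda \right) x>0\quad \text{for }x>0,\ \lambda <1, \end{equation*}

which is nonnegative and increasing in SES, in line with Proposition D.1.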
Finally, at the top of the SES distribution, the cost of being a smoker increases with social status. In the uniform-quadratic example we present in the Supplementary Material, things are even simpler: the cost, for a woman, of being a smoker (as measured by the difference uN(x) − uS(x)) is zero below the randomization threshold and increases with SES above it; for men, it is always positive and always increasing.

Footnotes

1 See for instance Coles and Francesconi (2013) for a search approach under nontransferable utility (NTU), and Chiappori, Oreffice, and Quintana-Domeque (2012) for a general investigation.

2 There is ample medical evidence that second-hand smoking has detrimental health effects on nonsmokers but not on smokers (ASH 2011; CDC 2006; Glymour et al. 2008; Mannino et al. 1997). In addition, the attitude of smokers toward smokers is much more permissive than that of nonsmokers (ASH 2011; Lader 2009; Pilkington et al. 2006).

3 Among multidimensional empirical works, Lindenlaub (2014) considers a specific model in which the distribution of types is normal and the payoff is quadratic to study multidimensional sorting between workers and jobs in the labor market, whereas Galichon and Salanié (2010) and Dupuy and Galichon (2014) use a Choo and Siow (2006) framework. For the analysis of dating patterns using a Gale–Shapley NTU approach, see Banerjee et al. (2013) and Hitsch, Hortaçsu, and Ariely (2010).

4 Taking the model to the data in a structural way is far beyond the scope of this paper. Extending the Choo and Siow (2006) methodology to a multidimensional setting with discrete and continuous characteristics is still an open question, despite recent and promising advances (Chiappori, Salanié, and Weiss 2014; Dupuy and Galichon 2014; Galichon and Salanié 2010).

5 In a pure Choo–Siow setting, this hypothesis would follow from a result by Graham (2011).

6 The theoretical analysis of matching under TU is typical in marriage market analysis, whereas models analyzing dating typically consider NTU. Matching under TU dates back to Koopmans and Beckmann (1957), Shapley and Shubik (1971), and Becker (1973). In particular, the last two contributions show that the stable matching maximizes aggregate surplus, and that the associated individual surpluses solve the dual imputation problem. In turn, the surplus maximization problem belongs to the class of optimal transportation problems, which date back to Monge (1781) and Kantorovich (1942); see Villani (2003) and McCann and Guillen (2010) for recent presentations. The precise connection between matching models and optimal transportation has been analyzed by Gretsky, Ostroy, and Zame (1999) in the discrete case, and by Gretsky, Ostroy, and Zame (1992), Ekeland (2010), and Chiappori, McCann, and Nesheim (2010) in the continuous one.

7 See for instance ASH (2011), CDC (2006), Glymour et al. (2008), Mannino et al. (1997), among others. Moreover, the attitude of smokers toward smokers is much more permissive than that of nonsmokers (ASH 2011; Lader 2009; Pilkington et al. 2006), suggesting that beyond the impact on life expectancy, the psychological costs of a smoking partner are large for nonsmokers but negligible for smokers.

8 A natural extension would allow the surplus to be decreased by a smaller amount when both spouses smoke than when only one is a nonsmoker; that is, the surplus would be discounted by λ < 1 if both are smokers, and by λ′ < λ if only one is a smoker. We provide a brief discussion in Section 5.

9 See Chiappori et al. (2010) for a complete presentation.

10 Examples can readily be constructed by using disconnected subpopulations among both men and women.

11 Madrian and Lefgren (1999) illustrate and explain the matching procedures to longitudinally merge the CPS respondents.

12 Standard errors are computed using the delta method.
The difference in the ratios is statistically significant at the 1% level (p-value = 0.0000).

13 Assuming that each extra year of education an individual gets is worth about an 8% increment to their annual earnings.

14 Lundberg (2012) also finds that extraversion and neuroticism only matter for women, whereas conscientiousness matters only for men, in terms of marriage probabilities.

References

Angrist J., Krueger A. (1999). “Empirical Strategies in Labor Economics.” In Handbook of Labor Economics, Vol. 3A, edited by Ashenfelter O., Card D. Elsevier, Amsterdam.

ASH (2011). “Secondhand Smoke.” Research Report.

Banerjee A., Duflo E., Ghatak M., Lafortune J. (2013). “Marry for What? Caste and Mate Selection in Modern India.” American Economic Journal: Microeconomics, 5, 33–72.

Banks J., Kelly E., Smith J.P. (2013). “Spousal Health Effects: The Role of Selection.” In Discoveries in the Economics of Aging (NBER Book Series), edited by Wise D.A. University of Chicago Press.

Becker G. (1973). “A Theory of Marriage: Part I.” Journal of Political Economy, 81, 813–846.

Becker G. (1991). A Treatise on the Family. Harvard University Press.

Caraballo R.S., Giovino G.A., Pechacek T.F., Mowery P.D. (2001). “Factors Associated with Discrepancies between Self-Reports on Cigarette Smoking and Measured Serum Cotinine Levels among Persons Aged 17 Years or Older: Third National Health and Nutrition Examination Survey, 1988–1994.” American Journal of Epidemiology, 153, 807–814.

CDC (2006). “The Health Consequences of Involuntary Exposure to Tobacco Smoke: A Report of the Surgeon General.” U.S. Government Printing Office, Washington, DC.

CDC (2010). “Vital Signs: Current Cigarette Smoking Among Adults Aged ≥18 Years—United States, 2009.” Morbidity and Mortality Weekly Report, CDC, 59(35), 1135–1140. Available from: http://www.cdc.gov/mmwr/preview/mmwrhtml/mm5935a3.htm?s_cid=mm5935a3_w. Accessed on July 2016.

Chiappori P.-A., Salanié B. (2016). “The Econometrics of Matching Models.” Journal of Economic Literature, 54, 832–861.

Chiappori P.-A., Iyigun M., Weiss Y. (2009). “Investment in Schooling and the Marriage Market.” American Economic Review, 99, 1689–1713.

Chiappori P.-A., McCann R., Nesheim L. (2010). “Hedonic Price Equilibria, Stable Matching, and Optimal Transport: Equivalence, Topology, and Uniqueness.” Economic Theory, 42, 317–354.

Chiappori P.-A., Oreffice S., Quintana-Domeque C. (2012). “Fatter Attraction: Anthropometric and Socioeconomic Matching on the Marriage Market.” Journal of Political Economy, 120, 659–695.

Chiappori P.-A., Oreffice S., Quintana-Domeque C. (2016). “Black-White Marital Matching: Race, Anthropometrics, and Socioeconomics.” Journal of Demographic Economics, 82, 399–421.

Chiappori P.-A., Salanié B., Weiss Y. (2014). “Partner Choice and the Marital College Premium.” Working paper, Columbia University.

Choo E., Siow A. (2006). “Who Marries Whom and Why.” Journal of Political Economy, 114, 172–201.

Clark A., Etilé F. (2006). “Don’t Give Up on Me Baby: Spousal Correlation in Smoking Behaviour.” Journal of Health Economics, 25, 958–978.

Coles M., Francesconi M. (2013). “Equilibrium Search and the Impact of Equal Opportunities for Women.” Economics Discussion Papers 742, University of Essex, Department of Economics.

Delnevo C.D., Bauer U.E. (2009). “Monitoring the Tobacco Use Epidemic III: The Host: Data Sources and Methodological Challenges.” Preventive Medicine, 48, S16–S23.

Dupuy A., Galichon A. (2014). “Personality Traits and the Marriage Market.” Journal of Political Economy, 122, 1271–1319.

Ekeland I. (2010). “Existence, Uniqueness and Efficiency of Equilibrium in Hedonic Markets with Multidimensional Types.” Economic Theory, 42, 275–315.

Fox J. (2010). “Identification in Matching Games.” Quantitative Economics, 1, 203–254.

Galichon A., Salanié B. (2010). “Cupid’s Invisible Hand: Social Surplus and Identification in Matching Models.” Discussion Paper: 0910–0914, Columbia University.

Glymour M., DeFries T., Kawachi I., Avendano M. (2008). “Spousal Smoking and Incidence of First Stroke: The Health and Retirement Study.” American Journal of Preventive Medicine, 35, 245–248.

Graham B. (2011). “Econometric Methods for the Analysis of Assignment Problems in the Presence of Complementarity and Social Spillovers.” In Handbook of Social Economics, Vol. 1B, edited by Benhabib J., Jackson M., Bisin A. North-Holland, pp. 965–1052.

Gretsky N., Ostroy J., Zame W. (1999). “Perfect Competition in the Continuous Assignment Model.” Journal of Economic Theory, 88, 60–118.

Gretsky N., Ostroy J., Zame W. (1992). “The Nonatomic Assignment Model.” Economic Theory, 2, 103–127.

Gruber J. (2001). “Tobacco at the Crossroads: The Past and Future of Smoking Regulation in the United States.” Journal of Economic Perspectives, 15(2), 193–212.

Hitsch G., Hortaçsu A., Ariely D. (2010). “Matching and Sorting in Online Dating.” American Economic Review, 100(1), 130–163.

Kantorovich L. (1942). “On the Translocation of Masses.” Dokl. Akad. Nauk SSSR, 37, 227–229.

Koopmans T.C., Beckmann M. (1957). “Assignment Problems and the Location of Economic Activities.” Econometrica, 25, 53–76.

Lader D. (2009). “Smoking-Related Behaviour and Attitudes, 2008/09.” Opinions Survey Report 40.

Lam D. (1988). “Marriage Markets and Assortative Mating with Household Public Goods: Theoretical Results and Empirical Implications.” Journal of Human Resources, 23(4), 462–487.

Lundberg S. (2012). “Personality and Marital Surplus.” IZA Journal of Labor Economics, 1.

Madrian B., Lefgren L. (1999). “A Note on Longitudinally Matching CPS Respondents.” NBER Technical Working Paper #247. NBER, Cambridge, MA.

Mannino D., Siegel M., Rose D., Nkuchia J., Etzel R. (1997). “Environmental Tobacco Smoke Exposure in the Home and Worksite and Health Effects in Adults: Results from the 1991 National Health Interview Survey.” Tobacco Control, 6, 296–305.

Maralani V. (2009). “An Unequal Start: The Alignment of Education and Smoking in Families of Origin.” Working paper, Yale University.

Mare R. (2008). “Educational Assortative Mating in Two Generations.” Working paper, University of California at Los Angeles.

McCann R., Guillen N. (2010). “Five Lectures on Optimal Transportation: Geometry, Regularity and Applications.” University of Toronto.

Mills A., Messer K., Gilpin L., Pierce G. (2009). “The Effect of Smoke-Free Homes on Adult Smoking Behavior: A Review.” Nicotine & Tobacco Research, 11, 1131–1141.

Monge G. (1781). “Mémoire sur la théorie des déblais et de remblais.” Histoire de l’Académie Royale des Sciences de Paris, avec les Mémoires de Mathématique et de Physique pour la même année. De l’Imprimerie Royale, Paris, pp. 666–704.

NCHS (2010). National Center for Health Statistics. Health, United States, 2009: With Special Feature on Medical Technology. Hyattsville, MD.

Oreffice S., Quintana-Domeque C. (2010). “Anthropometry and Socioeconomics Among Couples: Evidence in the United States.” Economics and Human Biology, 8(3), 373–384.

Pencavel J. (1998). “Assortative Mating by Schooling and the Work Behavior of Wives and Husbands.” American Economic Review Papers and Proceedings, 88, 326–329.

Pilkington P., Gray S., Gilmore A., Daykin N. (2006). “Attitudes Towards Second Hand Smoke amongst a Highly Exposed Workforce: Survey of London Casino Workers.” Journal of Public Health, 28, 104–110.

Qian Z. (1998). “Changes in Assortative Mating: The Impact of Age and Education, 1970–1990.” Demography, 35, 279–292.

Shapley L.S., Shubik M. (1971). “The Assignment Game I: The Core.” International Journal of Game Theory, 1, 111–130.

Shimer R., Smith L. (2000). “Assortative Matching and Search.” Econometrica, 68, 343–369.

Silventoinen K., Kaprio J., Lahelma E., Viken R.J., Rose R.J. (2003). “Assortative Mating by Body Height and BMI; Finnish Twins and their Spouses.” American Journal of Human Biology, 15, 620–627.

Surgeon General’s Report (2001). “Women and Smoking: A Report from the Surgeon General.” US Department of Health and Human Services.

Surgeon General’s Report (2012). “Preventing Tobacco Use Among Youth and Young Adults.” US Department of Health and Human Services.

Sutton G. (1980). “Assortative Marriage for Smoking Habits.” Annals of Human Biology, 7, 449–456.

Sutton G. (1993). “Do Men Grow to Resemble their Wives or Vice Versa?” Journal of Biological Science, 25, 25–29.

Venters M., Jacobs D., Luepker R., Maimaw L., Gillum R. (1984). “Spouse Concordance of Smoking Patterns: The Minnesota Heart Survey.” American Journal of Epidemiology, 120, 608–616.

Villani C. (2003). “Topics in Optimal Transportation.” Graduate Studies in Mathematics, Vol. 58. American Mathematical Society, Providence, RI.

Weiss Y., Willis R. (1997). “Match Quality, New Information, and Marital Dissolution.” Journal of Labor Economics, 15, 293–329.

Wooldridge J. (2002). Econometric Analysis of Cross Section and Panel Data. MIT Press, Cambridge, MA.

Öberg M., Jaakkola M., Woodward A., Peruga A., Prüss-Ustün A. (2011). “Worldwide Burden of Disease from Exposure to Second-Hand Smoke: A Retrospective Analysis of Data from 192 Countries.” Lancet, 377, 139–146.

© The Authors 2017. Published by Oxford University Press on behalf of European Economic Association. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

### Journal

Journal of the European Economic Association, Oxford University Press

Published: Feb 1, 2018
