Immigration, Wages, and Education: A Labour Market Equilibrium Structural Model

Llull, Joan
2017-09-18
Abstract

Recent literature analysing wage effects of immigration assumes labour supply is fixed across education-experience cells. This article departs from this assumption estimating a labour market equilibrium dynamic discrete choice model on U.S. micro-data for 1967–2007. Individuals adjust to immigration by changing education, participation, and/or occupation. Adjustments are heterogeneous: 4.2–26.2% of prime-aged native males change their careers; of them, some switch to white-collar careers and increase education by about three years; others reduce labour market attachment and reduce education also by about three years. These adjustments mitigate initial effects on wages and inequality. Natives that are more similar to immigrants are the most affected on impact, but also have a larger margin to adjust and differentiate. Adjustments also produce a self-selection bias in the estimation of wage effects at the lower tail of the distribution, which the model corrects.

1. Introduction

During the last forty years, 26 million immigrants of working-age entered the U.S. These immigrants have different skills and work in different occupations than natives, and they changed the composition of the workforce. This change may have affected the skill premium. How do human capital and labour supply decisions react to immigration? Would U.S. natives have spent fewer years in school without the massive inflow of foreign workers? Would they have participated more in the labour market? Would they have specialized in different occupations? Providing answers to these questions is crucial to understand the economic consequences of this massive inflow of foreign workers. Whether and to what extent immigration affected labour market opportunities of native workers has concerned economists and policy makers for years.
After an initial strand of the literature exploiting regional differences in immigration, more recent work used cross-skill variation at the national level to identify the effect of immigration on wages.1 Such analysis considers education-experience cells at a point in time as closed labour markets that are differently penetrated by immigrants. As noted by Borjas (2003, p. 1337) “the size of the native workforce in each of the skill groups is relatively fixed, so that there is less potential for native flows to contaminate the comparison of outcomes across skill groups”. This assumption is present in other papers in the literature (see Card (2009), Ottaviano and Peri (2012), and Llull (2017c) among many others, and Borjas (1999) and Card (2009) for surveys). Even though this cross-skill cell comparison has not brought a consensus on what the effect of immigration on average wages is (which is sensitive to assumptions on elasticities of substitution and on how capital reacts to immigration) most of the papers agree on the existence of asymmetric effects across different workers. As a result, the common assumption of fixed labour supply is not innocuous. Asymmetric effects across different workers change relative wages, and thus generate incentives to adjust human capital and labour supply decisions. Failing to account for these adjustments may lead to a substantial bias in the estimation of wage effects of immigration. Negative effects of immigration on wage levels and inequality would be overstated. In this article, I propose and estimate an equilibrium dynamic discrete choice structural model of a labour market with immigration. The model, estimated with U.S. micro-data, is used to identify wage effects of immigration over the last four decades, taking into account labour supply and human capital adjustments by natives and previous generations of immigrants. This approach allows me to address three main points. 
First, I can quantify and correct the biases in the estimated effects of immigration on wages and on wage inequality introduced by ignoring labour market adjustments to immigration. Second, the dynamic model allows me to identify non-trivial heterogeneous adjustments in education that could not be identified otherwise. And third, I find that labour market detachment produces an additional self-selection bias in the estimation of wage effects of immigration along the native wage distribution, which can be corrected with the model. The equilibrium framework builds on Altuğ and Miller (1998), Heckman et al. (1998), Lee (2005), and Lee and Wolpin (2006, 2010). The supply side of the model extends the structure of Keane and Wolpin (1994, 1997) to accommodate immigrant and native workers separately. Individuals live from age 16 to 65 years and make yearly forward-looking decisions on education, participation and occupation. Immigrants make these decisions as well when they are in the U.S. For these immigrants, the model is able to replicate two empirical regularities established in the literature: immigrants downgrade upon entry, that is, they earn lower wages than observationally equivalent natives (Dustmann et al., 2013); and they assimilate, since between two observationally equivalent immigrants, the one with greater time in the U.S. earns more (LaLonde and Topel, 1992). Human capital accumulates throughout the life-cycle both because of investments in education, and because learning-by-doing on the job leads to accumulation of occupation-specific work experience. In their human capital investment decisions, individuals make forecasts about future wages, which depend on future immigration. Individuals are rational in that they make the best possible forecast given the available information (current labour market conditions and the process describing aggregate uncertainty), but they are unable to perfectly foresee future immigration waves and wages. 
On the demand side, blue-collar and white-collar labour is combined with capital to produce a single output. Labour is defined in skill units, which implies that workers have heterogeneous productivity depending on their education, occupation-specific experience, national origin, gender, foreign experience, and unobservables. This approach flexibly allows for a continuum of possibilities of imperfect substitution between immigrant and native workers in production. I assume a nested Constant Elasticity of Substitution (CES) production function that accounts for skill-biased technical change through capital-skill complementarity (as in Krusell et al., 2000). This is important to correctly estimate native responses to immigration, because skill-biased technical change competes with immigration as a source of the increase in wage inequality over the last decades. The equilibrium framework is a crucial feature of the model because it links the immigration-induced supply shift with the changes in incentives for natives through changes in relative wages.

I fit the model to U.S. micro-data from the Current Population Survey (CPS) and the National Longitudinal Survey of Youth (NLSY) for the period 1967–2007. I then use the estimated parameters to quantify the effect of immigration on labour market outcomes. To do so, I define a counterfactual world without large-scale immigration in which the immigrant/native ratio is kept constant at 1967 levels. Then, I compare counterfactual wages, human capital, and labour supply with baseline simulations obtained with the estimated parameters. When I do not allow natives to adjust their human capital and labour supply decisions, results for wages are very much in line with existing papers in the literature, both qualitatively and quantitatively. Overall estimated effects are negative if physical capital does not react to immigration, and virtually zero if capital fully adjusts.
Also, the most important effects are on redistribution: less educated, younger, and male individuals are more affected than highly educated, older, and female individuals. When natives and previous generations of immigrants are allowed to respond to immigration by changing their labour supply and human capital decisions, results change in a non-trivial way. Negative effects on wages are mitigated, and redistribution effects are partially arbitraged out, and even reversed in some cases. For example, if capital is not allowed to react to immigration, wages of young males with high school education or less are reduced, on average, by $$4.7$$% on impact, and those of old college-educated females are reduced by $$4$$%; after human capital and labour supply adjustments take place, wage effects on the former move down to $$2.5$$%, whereas effects on the latter only go down to $$3.6$$%. This is because, even though immigrants are more similar to less educated young men than to old educated women, the former have a much larger margin of adjustment (they can increase education, switch occupations, and so on). All this suggests that biases in the estimation of wage effects of immigration are large and ambiguous when labour supply is assumed to be fixed.

In the model, individuals have three adjustment mechanisms: education, occupation, and participation. Results suggest a significant heterogeneity across individuals in optimal reactions to immigration. Between $$1.4$$% and $$12.4$$% of males adjust their education, depending on the assumed counterfactual evolution of capital. Likewise, among those that would work in blue-collar jobs in a given cross-section, $$1.4$$% to $$4.9$$% switch to white-collar jobs, and $$0.6$$% to $$3.4$$% decide not to work. Overall, $$4.2$$% to $$26.2$$% of the workforce adjust their career paths.
Regarding adjustments on education, the dynamic nature of the model allows me to identify non-trivial heterogeneous responses that have not been identified before in the literature. Some individuals become more likely to pursue a white-collar career and, since education is more rewarded there, they extend their stay in school by an average of $$3.1$$ –$$3.2$$ years.2 Others, however, become more detached from the labour market and, given the lower return to their investment, they drop out from school earlier, reducing their education by an average of $$2.9$$ years. Which of the two effects dominates is an empirical question, and crucially depends on the assumed counterfactual evolution of capital. When capital fully adjusts, the first channel prevails; the opposite is true if capital does not react to immigration. Occupation adjustments are also important. As noted above, $$1.4$$% to $$4.9$$% of native male that would work in a blue-collar job in a given cross-section switch to a white-collar job. This is so because individuals specialize to differentiate from immigrants, and they have comparative advantage in white-collar jobs. Finally, the adjustments on participation introduce an additional dimension. Immigration-induced labour market detachment is not randomly distributed over the workforce. Least productive individuals are more likely to drop out from the market as a consequence of immigration. As a result, the comparison of realized wages in baseline and counterfactual scenarios delivers a biased estimate of wage effects of immigration, because of the resulting compositional change (a similar argument to the standard selectivity bias described in Gronau (1974) and Heckman (1979)). The bias is expected to be more severe at the bottom tail of the wage distribution. The structure of the model allows me to identify and correct this bias in a natural way. Potential wages for individuals that decide not to work can be simulated with the estimated model. 
This allows me to compute the effect of immigration on potential wages along the native distribution comparing the same set of individuals with and without immigration, which avoids the compositional changes that generate the selectivity bias. The comparison of wage effects along the distributions of potential and realized wages allows me to quantify the size of the bias. Results reveal that the bias is only apparent below the median of the native wage distribution, and it is much larger in the scenario without capital adjustment. The size of the bias increases towards the bottom tail. At the 5th percentile, the negative effect of immigration on potential wages of prime-aged native male is 20%–46% larger than the estimated effect on realized wages, depending on the assumption about counterfactual capital, and 235%–275% larger in the case of female. This article contributes to the extensive literature on labour market effects of immigration. It highlights the importance of relaxing the assumption of fixed labour supply in models like those in Borjas (2003), Card (2009), Ottaviano and Peri (2012), or Llull (2017c). As in these papers, the size of average wage effects depends mostly on the assumption about the counterfactual evolution of capital, and the effects on relative wages are more important than on average wages. However, it contributes by showing that estimates differ substantially depending on whether labour market equilibrium adjustments are accounted for or not. It also departs from these papers in three additional dimensions. The first one is that the production function used here allows for capital-skill complementarity and skill-biased technical change. Lewis (2011) highlights the importance of capital-skill complementarity when analysing the effect of immigration on wages across local labour markets. Heckman et al. (1998) explore immigration and skill-biased technical change as competing sources for the increase in wage inequality. 
In the present equilibrium framework, it is very important to include them to avoid biases in the estimation of labour supply responses to immigration. The second one is that it introduces the occupational dimension, which gives micro-structure to the imperfect substitution between natives and immigrants discussed in Ottaviano and Peri (2012) and Manacorda et al. (2012); specialization was already hinted at as a potential source for this imperfect substitution in Peri and Sparber (2009) and Ottaviano and Peri (2012). In my model, the extent to which observationally equivalent immigrants and natives are imperfect substitutes in production is endogenously determined by the different choices they make in equilibrium. And the third one is that it departs from the classification of skills in terms of education-experience cells. Dustmann et al. (2013) introduced a more flexible definition of skills: the position along the native wage distribution. The measurement of aggregate labour supply in this article has a similar flavour.

Another strand of the literature estimates the wage effect of immigration comparing wages across different local labour markets. In that literature, Card (2001), Borjas (2006), and more recently Piyapromdee (2015), model spatial equilibrium responses to immigration in a static framework.3 Card (2001) additionally introduces occupations and participation to determine how natives and immigrants are competing in the same location. The role of internal migration in these papers is analogous to the one of human capital and labour supply adjustments here: it is the mechanism by which wage effects of immigration in some labour markets are arbitraged out towards the (initially unaffected) others in equilibrium. Piyapromdee (2015) simulates the economy with and without equilibrium adjustments, analogously to what I do below, and also finds that the effects on impact are initially negative, and then they are mitigated by equilibrium adjustments.
A few recent papers use the spatial approach to estimate the effect of immigration on related outcomes like schooling, task specialization, and employment. Hunt (2016) finds that native children, especially native blacks, increase their high school completion rates to avoid subsequent competition by unskilled immigrants in the labour market. Peri and Sparber (2009) provide evidence that native individuals specialize in language-intensive occupations as immigrants have comparative advantage in manual-intensive tasks. And Smith (2012) finds that immigration of low educated workers led to a substantial reduction in employment, particularly severe for native youth.

The rest of the article is organized as follows: Section 2 provides some descriptive facts about U.S. immigration; Section 3 presents the labour market equilibrium structural model with immigration; Section 4 presents the data and discusses identification, solution, and estimation of the model; in Section 5, I present parameter estimates and some exercises that evaluate the goodness of fit of the model; Section 6 presents the results from the counterfactual simulations, which quantify the labour market effects of immigration; and Section 7 concludes.

2. Exploring U.S. mass immigration

According to Census data, the U.S. labour force was enlarged by about 26 million working-age immigrants during the last four decades, an increase of almost 0.7 million per year. This section aims to compare the evolution of the skill composition of immigrants with that of natives, and to establish some correlations between immigration, schooling, and occupational choice. These facts serve as a motivation for the modelling decisions taken in subsequent sections. Table 1 presents the evolution of the share of immigrants in different subpopulations of the workforce from 1970 to 2008. The share of immigrants among working-age individuals increased from 5.7% to 16.6%.
The skill and occupational composition also changed substantially. The share of immigrants among the least educated increased faster than for any other group (from 6.8% to 33.7%). And immigrants are increasingly clustered in blue-collar jobs: the share of immigrants among blue-collar workers increased from 6% to 24% (much more than the overall increase from 5.7% to 16.6%) and, conditional on education, the share of immigrants among dropout blue-collar workers increased from 7.2% to 55.5% (compared to the overall increase from 6.8% to 33.7%). In sum, immigrants are increasingly less educated than natives, and they tend to cluster more in blue-collar jobs, even conditional on education.

TABLE 1
Share of immigrants in the population (%)

                               1970    1980    1990    2000    2008
A. Working-age population      5.70    7.13   10.27   14.62   16.56
B. By education
   High school dropouts        6.84    9.60   17.93   29.02   33.73
   High school graduates       4.32    5.14    7.94   12.04   13.27
   Some college                5.14    6.63    7.92    9.96   11.65
   College graduates           6.48    8.02   10.60   14.59   16.92
C. In blue-collar jobs
   All education levels        6.03    7.83   11.21   17.53   24.07
   High school dropouts        7.18   12.18   23.75   41.03   55.45
   High school graduates       4.19    4.94    7.57   12.47   17.30
   Some college                5.95    6.14    7.26    9.82   14.07
   College graduates           9.53    9.52   12.14   17.89   23.82

Notes: Figures in each panel indicate respectively the percentage of immigrants in the working-age population, in the pool of individuals with each educational level, and among blue-collar workers. Sources: Census data (1970–2000) and ACS (2008).

Further exploration of these facts, available in the Online Appendix (Llull, 2017a), yields three additional conclusions. First, the decrease in the relative education of immigrants compared to natives is due to a slower increase in their educational attainment, not to a decrease in absolute terms. Second, most of it can be explained by the change in the national origin composition of immigrants. And, third, the increasing clustering in blue-collar occupations occurred for all two-digit occupations included in the group.
In particular, the share of immigrants among farm labourers, among labourers, among service workers, among operatives, and among craftsmen (blue-collar) increased more rapidly than among professionals, among managers, among clerical and kindred, among sales workers, and among farm managers (white-collar). This suggests that the blue-collar/white-collar classification used in the model below captures reasonably well the differential increase in labour market competition introduced by the new immigrant inflows.

Borjas (2003, Secs. II–VI) compares immigration and wages in different skill cells, defined by education and (potential) experience. He considers four education groups and eight experience categories, defining cells that are then treated as closed labour markets. As immigration varies across skill groups, he uses this variation to identify the effect of immigration on wages in regressions that include different combinations of fixed effects. With this approach, he finds a sizable negative correlation between immigration and wages. I replicate his results using 1960–2000 Censuses and 2008 ACS in Panel A from Figure 1. The figure shows that the correlation between the share of immigrants and the average wage of native males in a cell (net of fixed effects) is negative. In particular, a one percentage point increase in the share of immigrants is associated with a 0.41 (standard error (s.e.) 0.044) percent decrease in average hourly wages.

FIGURE 1
The correlation of immigration with wages, school enrollment, and occupational choice

Notes: (A) Each point relates average log hourly wage and immigrant share in a given education-experience-year cell. Immigrant shares and average wages are computed for full time workers (20+ hours per week, 40+ weeks per year) aged 16–65 years. Both wages and immigrant shares are net of education, experience, and period fixed effects. The line shows the fitted regression line, with an estimated slope of $$-0.405$$ (0.044). (B) Each point relates the enrollment rate of individuals with a given completed level of education in a given year and the share of immigrants in that education-year cell. Immigrant shares and enrollment rates are computed with a sample of individuals aged 16–35 years. Both enrollment rates and immigrant shares are net of education and period fixed effects. The fitted regression line has an estimated slope of 0.458 (0.125). (C) Each point relates the fraction of individuals working in blue-collar that transit to white-collar in the next year and the immigrant share in a given education-experience-year cell. Immigrant shares and transition probabilities are computed with a sample of full time workers aged 16–65 years. Both transition probabilities and immigrant shares are net of education, experience, and period fixed effects. The fitted regression line has an estimated slope of 0.153 (0.045). General Notes: Education: high school dropouts, high school graduates, some college, and college graduates; potential experience (age minus education): 9 five-year groups; years: 1960, 1970, 1980, 1990, 2000, and 2008. Sources: Census, ACS, and matched March supplements of CPS.

Given the research question of this article, it is worthwhile to look at the correlation between immigration and education. Figure 1B compares school enrollment rates and immigrant shares, following an analogous approach to the one described for wages. In particular, I correlate the share of immigrants in a particular education group with enrollment rates of individuals aged 16–35 years who exactly achieved that educational level (net of education and time fixed effects). The intuition behind this exercise is as follows: an individual who has just completed, say, high school, will decide whether to enroll for one additional year or not depending on how tough the labour market competition for high school graduates is. The figure suggests a positive correlation.
Specifically, a one percentage point increase in the share of immigrants in a particular group is associated with a 0.46 (s.e. 0.125) percentage point increase in the enrollment rate at that educational level.

Older natives or those who already left education are less likely to go back to school to differentiate themselves from immigrants. A more natural mechanism for them is switching occupations. Figure 1C is suggestive of the extent to which this is observed in the data. In this graph, immigrant shares in education-experience cells are related to one-year blue-collar to white-collar transition probabilities in an analogous way to Figure 1A. The fitted regression suggests that a one percentage point increase in the share of immigrants in a cell is associated with a 0.15 (s.e. 0.045) percentage point increase in the one-year blue-collar to white-collar transition probability. This effect is sizable, as it suggests that the increase in immigration of the last decades would explain more than 10% of the observed increase in blue-collar to white-collar transitions. The result is indicative of the importance of taking occupational choice into account in the analysis.

The correlations presented in Figure 1B and C are suggestive of labour market adjustments to immigration in terms of human capital and labour supply. Career paths and human capital investments are forward-looking decisions that are difficult to assess through reduced form approaches. For this reason, the model below describes the behaviour of forward-looking agents making such decisions, within an equilibrium framework that links immigration and labour supply decisions of natives and previous immigrants through changes in relative wages.

3. A labour market equilibrium model with immigration

In this section, I present a labour market equilibrium model with immigration. The model, estimated with U.S. data, is used to quantify the effect of the last four decades of immigration on wages, accounting for human capital and labour supply adjustments by natives and previous generations of immigrants. This approach departs from the literature in that it models the labour supply and human capital decisions explicitly, instead of assuming that labour supply is fixed. It also takes into account skill-biased technical change (considered as an alternative hypothesis for the increase in wage dispersion in the U.S. in recent decades).

3.1. Career decisions and the labour supply

Native individuals enter the model at age $$a=16$$, and immigrants do so upon arrival in the U.S. Both natives and immigrants make yearly decisions until the age of 65 years, when they die with certainty.4 Each year, they choose among four mutually exclusive alternatives to maximize their lifetime expected utility. The alternatives are: to work in a blue-collar job, $$d_a=B$$; or in a white-collar job, $$d_a=W$$; to attend school, $$d_a=S$$; or to stay at home, $$d_a=H$$.

The decision to migrate to the U.S. is specified outside of the model. Identifying individual migration decisions requires observing immigrants in their home country and in the U.S., and additional micro data on stayers in all countries of origin. There are no data sets that I am aware of that contain all this information. And, even if there were, the extension of the model in this direction would be computationally infeasible (see Kennan and Walker, 2011; Llull and Miller, 2016). However, in estimation I allow the total inflow of immigrants and the distribution of characteristics with which they enter the U.S. to endogenously adjust to U.S. aggregate conditions (aggregate productivity shocks, wage rates, labour supply, and so on).5 This is so because no orthogonality condition is placed between aggregate quantities and the aggregate shock, as discussed below.
There are $$L$$ types of individuals that differ in skill endowments and preferences. These types are defined based on national origin and gender. I assume $$L=8$$ (male and female for four regions of origin: the U.S., Western countries, Latin America, and Asia/Africa). National origin is important for three reasons. First, as noted in Section 2, the evolution of the national origin composition of immigration can explain most of the evolution in the educational level of immigrants over the last decades. Second, there are important differences in wages, distribution of occupations, and labour market participation among immigrants coming from these different origins. And, third, these regions differ substantially in the incidence of illegal migration to the U.S.

Individual types are, hence, based on observable characteristics. Introducing permanent or persistent unobserved heterogeneity is infeasible for two reasons. The first one is computational tractability. The second is identification. In particular, since the decision to migrate would be endogenous to these unobservables (Borjas, 1987), identification of their distribution would require modelling individual migration decisions, which is not feasible as discussed above.

At every point in time $$t$$, an individual $$i$$ of type $$l$$ and age $$a$$ solves the following dynamic programming problem:

\begin{equation}
V_{a,t,l}(\Omega_{a,t})=\max_{d_a}U_{a,l}(\Omega_{a,t},d_a)+\beta \mathbb{E}\left[V_{a+1,t+1,l}(\Omega_{a+1,t+1})\mid\Omega_{a,t},d_a,l\right],
\label{eq:individual-problem} \tag{3.1}
\end{equation}

where $$\mathbb{E}[\cdot]$$ indicates expectation, $$\beta$$ is the subjective discount factor, and $$\Omega_{a,t}$$ is the state vector.6 The list of variables included in $$\Omega_{a,t}$$, as well as the way in which individuals form expectations about future $$\Omega$$, is discussed in Section 3.3. The terminal value is $${V_{65+1,t,l}=0\ \forall l,t}$$.
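The recursive structure of Equation (3.1), together with the terminal condition, can be illustrated by backward induction on a heavily simplified problem. The sketch below is not the estimated model: it is deterministic (no idiosyncratic or aggregate shocks, so the expectation drops out), ignores individual types and the aggregate state, and uses made-up flow utilities. The names `flow_utility`, `next_state`, and `value`, the discount factor, and all coefficients are assumptions introduced only for illustration.

```python
from functools import lru_cache
import math

BETA = 0.95                      # assumed discount factor, not an estimate
FIRST_AGE, LAST_AGE = 16, 65     # decision ages, as in the text
CHOICES = ("B", "W", "S", "H")   # blue-collar, white-collar, school, home


def flow_utility(choice, educ, x_blue, x_white):
    """Hypothetical per-period utility; every coefficient is illustrative."""
    if choice == "B":
        return math.exp(0.04 * educ + 0.05 * x_blue - 0.001 * x_blue ** 2)
    if choice == "W":
        return math.exp(-0.8 + 0.10 * educ + 0.05 * x_white - 0.001 * x_white ** 2)
    if choice == "S":
        return -0.5              # assumed net cost of attending school
    return 0.3                   # assumed value of staying at home


def next_state(choice, educ, x_blue, x_white):
    """Schooling adds a year of education; working adds occupation-specific experience."""
    return (educ + (choice == "S"),
            x_blue + (choice == "B"),
            x_white + (choice == "W"))


@lru_cache(maxsize=None)
def value(age, educ, x_blue, x_white):
    """Return (value, optimal choice) at a given age and state."""
    if age > LAST_AGE:
        return 0.0, None         # terminal condition: V at age 65+1 is zero
    best_v, best_d = -math.inf, None
    for d in CHOICES:
        continuation, _ = value(age + 1, *next_state(d, educ, x_blue, x_white))
        v = flow_utility(d, educ, x_blue, x_white) + BETA * continuation
        if v > best_v:
            best_v, best_d = v, d
    return best_v, best_d
```

Calling `value(16, 10, 0, 0)` returns the lifetime value and the first-period choice of a 16-year-old with ten years of education and no work experience under these toy assumptions. Memoizing on the state keeps the recursion tractable because many choice histories lead to the same education and experience totals.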
The instantaneous utility function is choice-specific, $$U_{a,l}(\Omega_{a,t},d_a=j)\equiv U_{a,t,l}^j$$ for $${j=B,W,S,H}$$. Workers have a linear utility. They are not allowed to save and, hence, they are not able to smooth consumption. Working utilities are given by: \begin{equation} U^j_{a,t,l}= w^j_{a,t,l}+\delta^{BW}_g{\rm 1}\kern-0.24em{\rm I} \{d_{a-1}=H\}\quad j=B,W,\label{eq:work-utility} \end{equation} (3.2) where $$w^j_{a,t,l}$$ are individual wages in occupation $$j=B,W$$, $${\rm 1}\kern-0.24em{\rm I}\{A\}$$ is an indicator function that takes the value of one if condition $$A$$ is satisfied and zero otherwise, and $$\delta^{BW}_g$$ is a gender-specific labour market reentry cost that workers pay to get a job if they were not working (and not in school) in the previous period.7 Workers are paid their marginal product. A base rate $$r_t^j$$ is determined in equilibrium, as discussed below. Individual wages are obtained by scaling this base rate by the relative productivity of the individual, through a factor $$s^j_{a,l}$$ that depends on individual characteristics and independent and identically distributed idiosyncratic shocks. Hence, wages $$w^j_{a,t,l} \equiv r^j_t \times s^j_{a,l}$$ are defined by a fairly standard Mincer equation (Mincer, 1974): \begin{equation} w^j_{a,t,l}= r^j_t \exp\{\omega^j_{0,l} + \omega^j_{1,is}E_a + \omega^j_2X_{Ba}+\omega^j_3X_{Ba}^2+\omega^j_4X_{Wa}+\omega^j_5X_{Wa}^2+ \omega^j_6X_{Fa}+ \varepsilon^j_a\},\label{eq:wages} \end{equation} (3.3) where: \begin{equation*} \begin{pmatrix}\varepsilon^B_a\\[6pt]\varepsilon_a^W\end{pmatrix}\sim \text{i.i.d. }\mathcal{N}\left(\begin{bmatrix}0\\[6pt]0\end{bmatrix},\begin{bmatrix}\sigma_{Bg}^2&\rho_{BW}\sigma_{Bg}\sigma_{Wg}\\[6pt]\rho_{BW}\sigma_{Bg}\sigma_{Wg}&\sigma_{Wg}^2\end{bmatrix}\right). 
\end{equation*} The exponential part of Equation (3.3) defines an expression for $$s^j_{a,l}$$ (individual skill units), as a function of individual type $$l$$, education $$E$$ (with $$is=nat,immig$$), blue-collar and white-collar domestic experience in the country $$X_B$$ and $$X_W$$, potential experience abroad $$X_F$$ (age at entry minus education), and a random shock, $$\varepsilon^j$$, with gender-specific variance $$\sigma^2_{jg}$$ and (gender-invariant) correlation across occupations $$\rho_{BW}$$. Idiosyncratic shocks are assumed to be independently and identically distributed across individuals, and uncorrelated with individual and aggregate characteristics. When working in occupation $$j$$, individuals accumulate one additional year of occupation $$j$$-specific experience, $$X_{ja+1}=X_{ja}+{\rm 1}\kern-0.24em{\rm I}\{d_a=j\}$$, which has a return in the future. Wages have been modelled extensively in the literature using Mincer equations (see Heckman et al. (2006) for a review). These have been proved to fit life-cycle earnings profiles reasonably well. The Mincer equation approximates the framework of human capital accumulation on the job in Ben-Porath (1967). As noted by Heckman et al. (2006, p. 317), the Mincer equation is consistent with a linearly declining rate of investment on-the-job, which implies that the log-wage is a quadratic function of experience. Formal education is introduced in the model as a special case in which all available time is devoted to skill accumulation. For this reason, education enters linearly in (the log of) Equation (3.3). Equation (3.3) also accounts for different rates of human capital accumulation in the different occupations, introducing separate returns to the experience obtained in each. Observationally equivalent natives and immigrants supply different amounts of skills for several reasons. 
First, they have different intercepts $$\omega^j_{0,l}$$, which capture non-fully-portable region-specific skills ($$e.g.$$ language and culture), and also regional differences in other dimensions, like the prevalence of illegal immigration. Second, their returns to education, $$\omega^j_{1,is}$$, may differ, because immigrants may undertake (at least a part of) their education abroad. As schooling abroad is not necessarily oriented towards the U.S. labour market, foreign education could map into lower wages compared to the education obtained in the U.S. ($$e.g.$$ learning Chinese calligraphy versus English grammar).8 And third, while abroad, immigrants accumulate foreign instead of domestic experience, which potentially has a different return. These differences generate a good fit of choices and wages of immigrants in the data, which is important to correctly quantify the magnitude of the immigration shock. Moreover, they generate two important regularities established in the literature. The first one is downgrading of immigrants upon arrival in the U.S. Dustmann et al. (2013) define downgrading as the situation in which the position occupied by immigrants along the native wage distribution is below the one they would occupy based on observables. The second regularity is immigrant assimilation. LaLonde and Topel (1992) define assimilation as the process whereby, between two observationally equivalent immigrants, the one with greater time in the U.S. earns more. According to this definition, immigrants assimilate as they accumulate some skills in the U.S. that they would not have accumulated in their home country (Borjas, 1999), which in the model is generated by a different (larger) return to domestic compared to foreign experience.9 Individuals who decide to attend school face a monetary cost, which is different for undergraduate ($$\tau_1$$), and graduate students ($$\tau_1+\tau_2$$). 
Additionally, they get a non-pecuniary utility with a permanent component $$\delta^S_{0,l}$$, a disutility of going back to school if they were not in school in the previous period $$\delta_{1,g}^S$$, and an i.i.d. transitory shock $$\varepsilon^S_a$$, normally distributed with gender-specific variance $$\sigma^2_{Sg}$$: \begin{equation} U^S_{a,l}=\delta^S_{0,l} -\delta_{1,g}^S{\rm 1}\kern-0.24em{\rm I}\{d_{a-1} \neq S\}-\tau_1{\rm 1}\kern-0.24em{\rm I}\{E_a \ge 12\}-\tau_2{\rm 1}\kern-0.24em{\rm I}\{E_a \ge 16\}+\varepsilon^S_a. \label{eq:school-utility} \end{equation} (3.4) As a counterpart, they increase their education, $$E_{a+1}=E_a+{\rm 1}\kern-0.24em{\rm I}\{d_a=S\}$$, which provides a return in the future. Finally, individuals remaining at home enjoy non-pecuniary utility given by: \begin{equation} U^H_{a,t,l}=\delta^H_{0,l}+\delta^H_{1,g}n_a+\delta^H_{2,g}t+\varepsilon^H_a. \label{eq:home-utility} \end{equation} (3.5) In this case, on top of its permanent and transitory components $$\delta^H_{0,l}$$ and $$\varepsilon^H_a$$ (normally distributed with gender-specific variance $$\sigma^2_{Hg}$$), the utility is increased by a gender-specific amount $$\delta^H_{1,g}$$ for each preschool child living at home, $$n_a$$.10 The home utility also includes a gender-specific trend $$\delta^H_{2,g}t$$. The linear utility assumption implies no income effect in the labour force participation decision, and, hence, participation is driven only by the substitution effect. In a framework with growing wages, everyone would eventually work in the long run. A linear trend in the home utility is a reduced form way of avoiding this problem. To sum up, the main trade-offs that define the labour supply problem are as follows. Individuals decide whether to enjoy home utility, invest in education, or work in one of the two available occupations. 
Even though one can enjoy it ($$\delta^S_{0,l}+\varepsilon^S_a$$ can be positive), attending school typically entails a contemporaneous cost. In return, education provides higher wages in the future. Since returns differ across occupations, the education decision is an important determinant of future career path, and the expected future path will also influence the decision to obtain education. When working, individuals are paid a wage, and they accumulate work experience, which maps into future wages. Forward-looking individuals could be interested in an occupation that pays a lower contemporaneous wage if the experience provides a high enough return in the future. Occupation decisions, hence, affect and are affected by future career prospects. Finally, the wage rate $$r^j_t$$ is an equilibrium outcome. It channels the effect of immigration towards native choices and wages. If immigrants have comparative advantage in blue-collar jobs, immigration may put negative pressure on the blue-collar rate, which generates incentives for natives to switch to white-collar careers. Likewise, it may also reduce general wage levels, which can make the home option more attractive. 3.2. Aggregate production function and the demand for labour The economy is represented by an aggregate firm that produces a single output, $$Y_t$$, combining labour (blue-collar and white-collar aggregate skill units, $$S_{Bt}$$ and $$S_{Wt}$$) and capital (structures and equipment, $$K_{St}$$ and $$K_{Et}$$) using a Constant Elasticity of Substitution (CES) technology described by the following production function: \begin{equation} Y_t = z_t K_{St}^\lambda \{ \alpha S_{Bt}^{\rho} + (1-\alpha) [ \theta S_{Wt}^{\gamma} + (1-\theta) K_{Et}^{\gamma} ]^{\rho/\gamma} \}^{(1-\lambda)/\rho}. \label{eq:production-function} \end{equation} (3.6) Equation (3.6) is a Cobb–Douglas combination of structures and a composite of labour and equipment capital. 
This composite is a CES aggregate of blue-collar labour and another CES aggregate, which combines equipment capital and white-collar labour. Parameters $$\alpha$$, $$\theta$$, and $$\lambda$$ are connected to the factor shares, and $$\rho$$ and $$\gamma$$ are related to the elasticities of substitution between the different inputs. The elasticity of substitution between equipment capital and white-collar labour is given by $$1/(1-\gamma)$$, and the elasticity of substitution between equipment capital or white-collar labour and blue-collar labour is $$1/(1-\rho)$$. Neutral technological progress is provided by the aggregate productivity shock $$z_t$$, whose evolution is described by: \begin{equation} \ln z_{t+1}-\ln z_t = \phi_0+\phi_1(\ln z_t-\ln z_{t-1})+\varepsilon^z_{t+1},\quad \varepsilon^z_{t+1}\sim\mathcal{N}(0,\sigma^2_z). \label{eq:shock-process} \end{equation} (3.7) This process allows for a linear trend in levels, with slope $$\phi_0$$, and business cycle fluctuations around it. The aggregate shock is assumed to be independent of idiosyncratic shocks, but it is allowed to be correlated with aggregate supplies of capital and labour (including immigrant inflows and the distribution of characteristics with which immigrants enter into the U.S.). Skill units are supplied by workers according to the exponential part of Equation (3.3). Even though, as noted above, I abstract from explicitly modelling individual migration decisions, in estimation I assume that immigrant inflows (and their distribution of skills) are determined endogenously following a known process (independent of idiosyncratic shocks but endogenous to aggregate fluctuations). Likewise, I also abstract from modelling individual savings decisions, and I proceed analogously with the aggregate capital supply.11 In the counterfactual experiments I simulate alternative scenarios with different assumptions about these processes, as discussed below. 
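As a numerical illustration of Equations (3.6) and (3.7), the sketch below evaluates the nested CES technology and simulates the process for the aggregate shock. It is only a sketch: the function names `output` and `simulate_log_z` and all parameter values in the example are hypothetical choices made for illustration, not estimates.

```python
import random


def output(z, k_s, k_e, s_b, s_w, alpha, theta, lam, rho, gamma):
    """Aggregate output Y_t as in Equation (3.6); inputs must be positive."""
    # Inner CES nest: white-collar skill units and equipment capital.
    kw = (theta * s_w**gamma + (1.0 - theta) * k_e**gamma) ** (1.0 / gamma)
    # Outer CES nest: blue-collar skill units and the inner composite.
    kbw = (alpha * s_b**rho + (1.0 - alpha) * kw**rho) ** (1.0 / rho)
    # Cobb-Douglas combination of structures and the labour-equipment composite.
    return z * k_s**lam * kbw ** (1.0 - lam)


def simulate_log_z(ln_z0, ln_z1, phi0, phi1, sigma_z, periods, seed=0):
    """Simulate ln z_t as in Equation (3.7): an AR(1) in growth rates."""
    rng = random.Random(seed)
    path = [ln_z0, ln_z1]
    for _ in range(periods):
        growth = phi0 + phi1 * (path[-1] - path[-2]) + rng.gauss(0.0, sigma_z)
        path.append(path[-1] + growth)
    return path


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only so that rho > gamma.
    params = dict(alpha=0.5, theta=0.4, lam=0.1, rho=0.3, gamma=-0.5)
    print(output(1.0, 2.0, 3.0, 4.0, 5.0, **params))
    print(simulate_log_z(0.0, 0.01, 0.01, 0.3, 0.02, periods=5))
```

Because the technology is a Cobb–Douglas combination of structures and a linearly homogeneous CES composite, doubling all four inputs doubles output, which is a convenient check on any implementation.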
Equation (3.6) is somewhat different from the three-level nested CES proposed by Card and Lemieux (2001) and popularized in the immigration literature by Borjas (2003).12 In particular, it differs in three aspects: (1) it allows for capital-skill complementarity and skill-biased technical change, (2) it takes occupations into account, and (3) instead of classifying individuals in skill cells based on educational level and age, it defines skills in a flexible way, accounting for observed and unobserved heterogeneity, in a similar spirit as in Dustmann et al. (2013). Capital-skill complementarity is important to account for skill-biased technical change. Krusell et al. (2000) show that technical change dramatically reduced the relative price of equipment capital starting in the early 1970s. Using a production function that resembles the one in Equation (3.6), they find that this technical change is skill-biased because $$\rho>\gamma$$ (meaning that equipment capital is more complementary to skilled labour than to unskilled labour). In particular, the increasing speed of accumulation of equipment capital increases the demand for white-collar workers. These authors find that this mechanism alone can explain most of the variation in the skill premium over the subsequent decades. In an equilibrium framework, not accounting for the increase in the demand for white-collar workers induced by the accumulation of equipment capital would lead to an overestimation of the reaction of natives to the inflow of immigrant blue-collar workers. Allowing for different occupations in the production function is also important. Natives and immigrants may be imperfect substitutes in production because their different skills may lead them to different choices of occupations (Ottaviano and Peri, 2012, p. 175).13 The evidence provided by Ottaviano and Peri (2012) suggests that it is important to account for this imperfect substitution. 
In the present article, occupational choice endogenizes the extent to which this imperfect substitution between natives and immigrants shows in the data. Additionally, occupation specialization is an important adjustment mechanism employed by natives to react to the labour market competition induced by immigrants (Peri and Sparber, 2009). Moreover, occupational switching is an important determinant of the increase in wage inequality (Kambourov and Manovskii, 2009).14 Finally, Dustmann et al. (2013) discuss the importance of defining skill groups in a flexible way, departing from the skill-cell approach in Borjas (2003) and Ottaviano and Peri (2012). In particular, these authors note that immigrants downgrade upon entry into the destination country. As a result, they do not compete with the natives that share the same observable skills, but, instead, with those that work in the same jobs. Skill units, defined in Equation (3.3), determine, together with occupation, a more accurate measurement of labour market competition. They also generate further wage heterogeneity, which makes it possible to quantify heterogeneous effects along the native wage distribution. And, importantly, despite all the extra flexibility, this approach is more parsimonious. 3.3. Expectations To make their decisions, individuals need to forecast the future path of the state variables, including future skill prices. The state vector at year $$t$$, $$\Omega_{a,t}$$, includes the following state variables: age, education, blue-collar and white-collar effective work experience, foreign potential experience, previous year decision, calendar year, number of children, idiosyncratic shocks, current skill prices, and the necessary information to forecast future skill prices. The first seven evolve deterministically given choices. Thus, workers only face uncertainty about future skill prices, number of children, and idiosyncratic shocks. 
Let $$F(n_{a+1},\varepsilon _{a+1},r^B_{t+1},r^W_{t+1}|\Omega_{a,t})$$ denote the distribution of these variables in the next period conditional on the current state, with $$\varepsilon_a\equiv(\varepsilon^B_a,\varepsilon^W_a,\varepsilon^S_a,\varepsilon^H_a)'$$. I assume: \begin{align} \label{eq:transition-probabilities} F(n_{a+1},\varepsilon _{a+1},r^B_{t+1},r^W_{t+1}|\Omega_{a,t})=F^\varepsilon (\varepsilon_{a+1})F^n (n_{a+1}|n_a,E_a,a,t)F^r(r^B_{t+1},r^W_{t+1}|\Omega_{a,t}). \end{align} (3.8) Equation (3.8) implies that the processes for the idiosyncratic shock, number of children, and skill prices are independent. As noted above, $$F^\varepsilon (.)$$ is a multivariate normal with gender-specific parameters $$\Sigma_g\equiv(\sigma_{Bg},\sigma_{Wg},\sigma_{Sg},\sigma_{Hg},\rho_{BW})'$$, independent of individual-specific and aggregate variables. The assumption of independence with respect to aggregate supplies of skills and skill prices is consistent with assuming that individuals are atomistic (Altuğ and Miller, 1998). The fertility process $$F^n (.)$$ is conditional on education, age, calendar year, and current number of children, and it is independent of any other state variable, including current and future idiosyncratic and aggregate shocks (conditional on calendar year). Forecasting skill prices is more complicated. Future skill prices depend on future aggregate supplies of labour and capital, and on the aggregate productivity shock. The process for the aggregate shock is described by Equation (3.7). Future capital stocks in equilibrium depend on the future aggregate shock and labour supplies, and on the (unspecified) capital supply process. Future labour supply depends on the future aggregate shock, capital stocks, and cohort sizes (including the future stock of immigrants), and on the future distribution of individual-specific state variables in the economy (Krusell and Smith, 1998). 
Under rational expectations, $$F^r(.)$$ should be such that individuals make the best possible forecast conditional on the available information in period $$t$$. Thus, to specify $$F^r(.)$$, one should specify a process for all the above, including the endogenous responses of immigration and capital stocks to the aggregate shock and to labour supply, and, importantly, the entire distribution of individual state variables in the economy. This is unfeasible. Alternatively, to make the problem tractable, I follow an approach that combines the approximation algorithm in Krusell and Smith (1998) and the framework in Altuğ and Miller (1998), in the same spirit as Lee and Wolpin (2006, 2010). Specifically, I approximate future skill prices by the following autoregressive process: \begin{equation} \Delta\ln r^j_{t+1}=\eta_0^j+\eta_j\Delta\ln r^j_t+\eta_z^j\Delta\ln z_{t+1}.\label{eq:expectations} \end{equation} (3.9) This rule is a good approximation to rational expectations if the parameter vector $${\eta\equiv \left(\eta_0^B,\eta_B,\eta_z^B,\eta_0^W,\eta_W,\eta_z^W\right)'}$$ is such that Equation (3.9) provides a good fit to the process for skill prices. As shown in Section 5, this seems to be the case: conditional on $$z_{t+1}$$, the process explains 99.9% of the variation in skill prices. Providing an almost perfect fit, however, does not mean that individuals perfectly foresee future skill prices, because $$z_{t+1}$$ is not observable at time $$t$$. Equation (3.9) is a reduced form of the model structure that individuals use to predict future skill prices. Thus, the vector $$\eta$$ is not really a vector of parameters, but, instead, an implicit function of the fundamental parameters of the model, and, hence, part of the solution. 
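The fitting step implied by Equation (3.9) can be sketched as an ordinary least squares regression of $$\Delta\ln r^j_{t+1}$$ on a constant, $$\Delta\ln r^j_t$$, and $$\Delta\ln z_{t+1}$$ for one occupation $$j$$. The Python code below is a minimal illustration under assumptions: the helper names `solve_linear` and `fit_forecasting_rule` are hypothetical, the data in the example are simulated from made-up coefficients, and the code covers only the regression step, not the fixed point in which $$\eta$$ must be consistent with the behaviour it induces.

```python
import random


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_forecasting_rule(dln_r, dln_z):
    """OLS of dln_r[t+1] on (1, dln_r[t], dln_z[t+1]).

    Returns [eta_0, eta_j, eta_z] for one occupation.
    """
    periods = range(len(dln_r) - 1)
    rows = [(1.0, dln_r[t], dln_z[t + 1]) for t in periods]
    y = [dln_r[t + 1] for t in periods]
    xtx = [[sum(r[i] * r[k] for r in rows) for k in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve_linear(xtx, xty)


if __name__ == "__main__":
    rng = random.Random(0)
    eta_true = (0.002, 0.3, 0.8)          # hypothetical coefficients
    dln_z = [rng.gauss(0.01, 0.02) for _ in range(41)]
    dln_r = [0.0]
    for t in range(40):
        dln_r.append(eta_true[0] + eta_true[1] * dln_r[t] + eta_true[2] * dln_z[t + 1])
    print(fit_forecasting_rule(dln_r, dln_z))
```

Because the simulated series contains no approximation error, the regression recovers the made-up coefficients up to rounding; with model-generated skill prices the fit would be close but not exact.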
The fact that Equation (3.9) provides an extremely good fit of the process for skill prices indicates that current skill prices and especially the evolution of the aggregate shock are (almost) sufficient statistics to predict future skill prices for a given $$\eta$$.15 This is reasonable given that all aggregate processes are assumed to endogenously react to the aggregate productivity shock. For example, expectations about future immigration and its effect on future wages are determined by expectations about the evolution of the aggregate shock (which determines future levels of immigration directly and indirectly through equilibrium adjustments), and its mapping into future wages (which includes wage effects of immigration). If a positive aggregate productivity shock is expected to lead to an increase in unskilled immigration, $$\eta^B_z$$ could be relatively small compared to $$\eta^W_z$$ (if unskilled immigration drives blue-collar relative wages down). Likewise, an unconditional expectation of low-skilled immigration in the future could imply that $$\eta_0^B$$ is relatively small compared to $$\eta_0^W$$ (again, if unskilled immigration puts downward pressure on blue-collar wages). 3.4. Equilibrium The market structure of this economy is as follows. Supplies of capital and immigrants are given by processes determined outside of the model that endogenously react to aggregate conditions in the economy, but that are independent of individual unobservable characteristics.16 The aggregate supply of skill units in occupation $$j=B,W$$ is given by: \begin{equation} S_{jt}=\sum_{a=16}^{65}\sum_{i=1}^{N_{a,t}}s_{a,i}^j{\rm 1}\kern-0.24em{\rm I}\{d_{a,i}=j\}, \label{eq:aggregate-skills} \end{equation} (3.10) where $$N_{a,t}$$ is the cohort size. The aggregate demands are derived from the aggregate firm’s profit maximization, which equalizes marginal returns to rental prices. 
In particular, labour demands are given by: \begin{equation} r^B_t=(1-\lambda)\alpha \left(\frac{S_{Bt}}{KBW_t}\right)^\rho \frac{Y_t}{S_{Bt}},\label{eq:blue-skill-price} \end{equation} (3.11) for blue-collar skill units, where $$KBW_t\equiv \{ \alpha S_{Bt}^{\rho} + (1-\alpha) [ \theta S_{Wt}^{\gamma} + (1-\theta) K_{Et}^{\gamma} ]^{\rho/\gamma} \}^{1/\rho}$$ is the CES aggregate of labour and equipment capital in Equation (3.6), and: \begin{equation} r^W_t=(1-\lambda)(1-\alpha)\theta \left(\frac{KW_t}{KBW_t}\right)^\rho \left(\frac{S_{Wt}}{KW_t}\right)^\gamma\frac{Y_t}{S_{Wt}},\label{eq:white-skill-price} \end{equation} (3.12) for white-collar skill units, where $$KW_t\equiv[ \theta S_{Wt}^{\gamma} + (1-\theta) K_{Et}^{\gamma} ]^{1/\gamma}$$ is the CES aggregate of equipment capital and white-collar labour. Demands for structures and equipment capital are given by analogous expressions. The equilibrium is given by market clearing conditions. Equilibrium prices $$r^B_t$$, $$r^W_t$$, $$r^S_t$$ and $$r^E_t$$ (the last two are the rental prices of structures and equipment capital respectively) are such that supply and demand of immigration, of skill units in the U.S., and of capital are equalized. Empirically, baseline levels of immigration and capital in equilibrium are observed in the data. In counterfactuals, immigration levels are implied by the design of the policy experiment, and different scenarios for capital supply adjustment are simulated, as noted in Section 6. Baseline and counterfactual labour supplies are obtained by solving the equilibrium. Given Equations (3.11) and (3.12), we can write the (log of the) relative white-collar to blue-collar skill price as: \begin{equation} \ln \frac{r^W_t}{r^B_t}=\ln \frac{(1-\alpha)\theta}{\alpha}+(\rho-1)\ln \frac{S_{Wt}}{S_{Bt}}+(\rho-\gamma)\ln \frac{KW_t}{S_{Wt}}. 
\label{eq:Tinbergen-race} \end{equation} (3.13) Equation (3.13) can be interpreted as a reformulation of Tinbergen’s race between technology and the supply of skills (Tinbergen, 1975).17 The second term of this equation is the negative contribution of the relative supply of skills (if $$\rho<1$$) and the last term captures the biased technical change through the increase in the speed of accumulation of equipment capital (whenever $$\rho>\gamma$$). Immigration changes the relative supplies of skills. Not allowing for capital-skill complementarity (imposing $$\gamma=\rho$$) would put all the burden of the increase in the relative wages observed in the last decades on the change in relative labour supplies. Since immigration pushed $$S_{Wt}/S_{Bt}$$ steadily down over the last decades, coinciding with a period of important skill-biased technical change, wrongly imposing $$\gamma=\rho$$ would induce a negative bias in the estimation of $$(\rho-1)$$, leading to an over-prediction of the effect of immigration on relative wages. 4. Identification, data, and estimation This section gives an overview of the main identification arguments, describes the most important features of data construction, and introduces a sketch of the algorithm used for the solution and estimation of the model. A more thorough discussion of model identification is presented in Appendix A. Additionally, detailed descriptions of the solution/estimation algorithm, and of variable definitions and sample selection are available in the Online Appendix (Llull, 2017a). 4.1. Identification The following discussion builds on previous work by Hotz and Miller (1993), Altuğ and Miller (1998), Magnac and Thesmar (2002), Arcidiacono and Miller (2011, 2015), and Kristensen et al. (2015). The main assumption exploited for identification is that the idiosyncratic shocks are independent of all other state variables. 
Identification also relies on the additional assumption that conditional choice probabilities (CCPs) are identified non-parametrically from observed decisions in the data. The latter is not trivial in practice, because the aggregate shock and the skill prices, which are not observable, are state variables, and because only partitions of the state vector are included in each of the datasets used in estimation. To simplify the argument, I assume hereinafter that CCPs are identified (identification of the CCPs is discussed in Appendix A). The wage equations are identified following standard arguments in the self-selection literature (Heckman, 1974, 1979). In particular, we can use a control function approach that corrects for the bias induced by the fact that the disturbance is not zero-mean conditional on $$d_a=j$$. This control function is a mapping on the CCPs (Heckman and Robb, 1986; Hotz and Miller, 1993). Given the parametric assumption for the distribution of $$\varepsilon_a$$, this mapping is known. But even if it were not, it would be non-parametrically identified since the model provides exclusion restrictions, like the fact that the number of children affects participation but not wages (Ahn and Powell, 1993; Das et al., 2003). Aggregate skill prices, which are not observable, are identified as the coefficients of calendar time dummies. This requires a normalization of one of the intercepts, $$\omega^j_{0,l}$$ for some $$l$$, in each wage equation; I normalize native-male intercepts in both equations to zero. Identification of skill prices leads to identification of the production function parameters and the aggregate shock. Individual skill units are identified as $${s^j_{a,i}=\exp\{\ln w^j_{a,t,l}-\ln r^j_t\}}$$, and aggregate skill units are identified by integrating $$s^j_{a,i}$$ over the sample of individuals working in occupation $$j$$. 
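The last step above is simple arithmetic once skill prices are known, and can be sketched as follows. This is a minimal illustration, assuming that each sampled worker carries a survey weight that scales the sample to the population (the `weight` field is an assumption, as are the function names and the numbers in the example); it mirrors the sum in Equation (3.10) rather than any particular estimation code.

```python
import math


def skill_units(log_wage, log_price):
    """Individual skill units: s = exp(ln w - ln r)."""
    return math.exp(log_wage - log_price)


def aggregate_skill_units(workers, log_prices):
    """Weighted sum of skill units by occupation, as in Equation (3.10).

    workers: iterable of (occupation, log_wage, weight) tuples, where
             occupation is "B" or "W"; any other label is ignored.
    log_prices: dict mapping "B" and "W" to the log skill price ln r^j_t.
    """
    totals = {"B": 0.0, "W": 0.0}
    for occupation, log_wage, weight in workers:
        if occupation in totals:
            s = skill_units(log_wage, log_prices[occupation])
            totals[occupation] += weight * s
    return totals


if __name__ == "__main__":
    # Hypothetical skill prices and workers, for illustration only.
    prices = {"B": math.log(10.0), "W": math.log(20.0)}
    sample = [
        ("B", math.log(10.0), 2.0),   # one skill unit each, weight 2
        ("W", math.log(40.0), 1.0),   # two skill units, weight 1
        ("H", 0.0, 5.0),              # at home: supplies no skill units
    ]
    print(aggregate_skill_units(sample, prices))
```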
Using data on output $$Y_t$$ and capital $$K_{Et}$$ and $$K_{St}$$, the production function parameters are identified from the first-order conditions of the firm’s problem (provided that at least three periods are available) without imposing any additional orthogonality condition (see Appendix A). Given these parameters, aggregate shocks $$\{z_t\}_{t=t_0}^T$$ are identified as residuals in Equation (3.6), and the AR(1) coefficients for the shock process ($$\phi_0$$ and $$\phi_1$$ in Equation (3.7)) are identified as regression coefficients. Likewise, combining $$\{z_t\}_{t=t_0}^T$$ and $$\{r^B_t,r^W_t\}_{t=t_0}^T$$, the equilibrium value of $$\eta$$ is identified from Equation (3.9). The identification of the remaining parameters of the model follows standard arguments in the literature. I fix the discount factor $$\beta$$, which has been shown to be identified only through the functional form assumptions of the model (Magnac and Thesmar, 2002). The parameters that remain to be identified are $$\delta^{BW}_g$$, $$\delta^S_{0,l}$$, $$\delta_{1,g}^S$$, $$\tau_1$$, $$\tau_2$$, $$\sigma_{Sg}$$, $$\delta^H_{0,l}$$, $$\delta^H_{1,g}$$, $$\delta^H_{2,g}$$, and $$\sigma_{Hg}$$. Proposition 1 in Hotz and Miller (1993) establishes that the mapping between value functions and CCPs can be inverted so that we can express continuation values as a function of the CCPs. Kristensen et al. (2015) prove that this result still holds in the case in which utility functions do not satisfy additive separability, as is the case here. Identification of the dynamic model is thus provided by observed choices and CCPs for future periods. Even though the discrete choices are made based on the difference between the utility obtained from each alternative and the one obtained from a base alternative, there is no need for further normalizations. 
The reason is that $$\delta^{BW}_g$$ is common for blue-collar and white-collar alternatives, and that the parameters from the wage equation, including variance–covariance parameters, have already been identified above, which is ultimately the result of observing wages. Three remarks about identification are worth highlighting. First, the parameters of the production function are identified from the variation in the micro data on wages and choices. This permits identification without imposing orthogonality conditions between aggregate variables and the aggregate shock, allowing capital stocks and immigration to be endogenous to aggregate fluctuations.18$$^,$$19 If specified, the endogenous processes for the supply of immigrants and capital would predict equilibrium quantities as a function of the aggregate shock. Observed capital and immigrant stocks are assumed to be realizations of these equilibrium outcomes in the baseline economy, and thus, they are sufficient statistics for the processes that generated them. As a result, identification is achieved without specifying these processes. However, observed values of the same aggregates are no longer equilibrium values in counterfactual scenarios. Thus, counterfactual simulations require additional assumptions about the processes as discussed below. Second, permanent (or even persistent) unobserved heterogeneity cannot be identified in this model because immigrants are not observed in their home countries prior to migration. As discussed in Aguirregabiria and Mira (2010, p. 55), in the model with unobserved heterogeneity, the initial condition would potentially be endogenous because it would not be independent of the individual’s unobservable. More specifically, there would be a self-selection of who migrates and when based on these unobservables (Borjas, 1987). To recover the parameters of such model one would need to specify migration decisions which, as discussed above, is not feasible. 
Alternatively, one could try to identify the distribution of types for each possible value of observables at entry, which include education, age, and region of origin. However, that would increase the computational burden and the parameter space so much that it would also be unfeasible. And, third, education decisions (and individual decisions in general) are identified in the model through the exclusion restrictions, the observation of wages conditional on education, and the conditional choice probabilities, which embed individuals’ expectations about future returns to education. Identification relies on the assumption that the current level of education is uncorrelated with the idiosyncratic shock. It also depends on expectations about future equilibrium wages, which are approximated by the rule presented above and identified with quite unrestrictive assumptions. As shown below, the model is able to replicate the observed evolution of individual choices, distribution of state variables, one-year transitions, wage profiles, and returns to education, which is reassuring. 4.2. Model solution and estimation For estimation, it is convenient to differentiate two types of parameters: the expectation parameters, $$\Theta_2$$, which include the forecasting rule described in Equation (3.9) and the process for the aggregate shock in Equation (3.7); and the fundamental parameters of the model, $$\Theta_1$$, which include the remaining parameters. Parameters $$\Theta_2$$ are part of the solution of the model, and hence can be expressed as $$\Theta_2(\Theta_1)$$.20 Parameters $$\Theta_1$$ are estimated by Simulated Minimum Distance (SMD). The SMD estimator minimizes the distance between a set of statistics obtained from microdata, listed in Section 4.3, and their counterparts predicted by the model. 
The expectation parameters $$\Theta_2(\Theta_1)$$ are obtained as a fixed point in an algorithm that obtains equilibrium skill prices and aggregate shocks by simulating the behaviour of individuals who form their expectations using a guess of $$\Theta_2$$, and updates the guess by fitting Equations (3.7) and (3.9) to the simulated data. Thus, the estimation process requires a nested algorithm that estimates $$\Theta_1$$, and solves for $$\Theta_2$$ given $$\Theta_1$$. Lee and Wolpin (2006, 2010) describe a nested algorithm with an inner procedure that finds the fixed point in $$\Theta_2$$ for every guess of $$\Theta_1$$, and an outer loop that finds $$\Theta_1$$ using a polytope minimization algorithm. The main problem with this procedure is that it requires solving the fixed point problem in every evaluation of $$\Theta_1$$, and this increases the computational burden significantly.21 Alternatively, I propose an algorithm that avoids this problem by swapping the two procedures, in the spirit of Aguirregabiria and Mira (2002). In particular, $$\Theta_1$$ is estimated for every guess of $$\Theta_2$$, which is updated at a lower frequency. In other words, I estimate $$\Theta_1(\Theta_2)$$ for every guess of $$\Theta_2$$. This algorithm is described in detail in the Online Appendix (Llull, 2017a). 4.3. Data To estimate the model, I fit 27,636 statistics computed with micro-data from 1967 to 2007 obtained from the March Supplement of the Current Population Survey (CPS), and the two cohorts of the National Longitudinal Survey of Youth (NLSY79 and NLSY97). These statistics, listed in Table 2, include information on choice probabilities, one-year transitions, distributions of education and experience, and mean, log-first difference, and variance of wages, all conditional on observable characteristics. 
Additionally, aggregate data on output and the stocks of structures and equipment capital from the Bureau of Economic Analysis, and on cohort sizes (by gender and immigrant status), distribution of entry age for immigrants, distribution of initial schooling (at the age of 16 years for natives and at entry for immigrants), and distribution of immigrants by region of birth from the Census are used in the solution and estimation. The transition probability process for fertility (number of preschool children) is directly estimated from observed transitions in the data (Census and CPS). Data sources, sample construction, and variable definitions are detailed in the Online Appendix (Llull, 2017a).

TABLE 2. Data

| Group of statistics | Source | Calculation | Number of statistics |
|---|---|---|---|
| TOTAL | | | 27,636 |
| Proportion of individuals choosing each alternative... | | | 5,074 |
| By year, sex, and 5-year age group | CPS | $$41\times2\times10\times(4-1)$$ | 2,460 |
| By year, sex, and educational level | CPS | $$41\times2\times4\times(4-1)$$ | 984 |
| By year, sex, and preschool children | CPS | $$41\times2\times3\times(4-1)$$ | 738 |
| By year, sex, and region of origin | CPS | $$15\times2\times4\times(4-1)$$ | 360 |
| Immigr., by year, sex, and foreign potential exp. | CPS | $$15\times2\times5\times(4-1)$$ | 450 |
| By sex and experience in each occupation | NLSY | $$2\times(5\times5+4\times4)\times(2-1)$$ | 82 |
| Wages: | | | 6,044 |
| Mean log hourly real wage... | | | 3,000 |
| By year, sex, 5-year age group, and occupation | CPS | $$41\times2\times10\times2$$ | 1,640 |
| By year, sex, educational level, and occupation | CPS | $$41\times2\times4\times2$$ | 656 |
| By year, sex, region of origin, and occupation | CPS | $$15\times2\times4\times2$$ | 240 |
| Immigrants, by year, sex, fpx, and occupation | CPS | $$15\times2\times5\times2$$ | 300 |
| By sex, experience in each occupation, and occ. | NLSY | $$2\times(5\times5+4\times4)\times2$$ | 164 |
| Mean 1-year growth rates in log hourly real wage... | | | 2,148 |
| By year, sex, previous, and current occupation | CPS$$^\S$$ | $$35\times2\times2\times2$$ | 280 |
| By year, sex, 5-year age group, and current occ. | CPS$$^\S$$ | $$35\times2\times10\times2$$ | 1,400 |
| By year, sex, region of origin, and current occ. | CPS$$^\S$$ | $$13\times2\times4\times2$$ | 208 |
| Immigr., by year, sex, years in the U.S., and occ. | CPS$$^\S$$ | $$13\times2\times5\times2$$ | 260 |
| Variance in the log hourly real wages... | | | 896 |
| By year, sex, educational level, and occupation | CPS | $$41\times2\times4\times2$$ | 656 |
| By year, sex, region of origin, and occupation | CPS | $$15\times2\times4\times2$$ | 240 |
| Career transitions... | | | 12,138 |
| By year and sex | CPS$$^\S$$ | $$35\times2\times4\times(4-1)$$ | 840 |
| By year, sex, and age | CPS$$^\S$$ | $$35\times2\times10\times4\times(4-1)$$ | 8,400 |
| By year, sex, and region of origin | CPS$$^\S$$ | $$13\times2\times4\times4\times(4-1)$$ | 1,248 |
| New entrants taking each choice by year and sex | CPS | $$15\times2\times(4-1)$$ | 90 |
| Immigrants, by year, sex, and years in the U.S. | CPS$$^\S$$ | $$13\times2\times5\times4\times(4-1)$$ | 1,560 |
| Distribution of highest grade completed... | | | 4,260 |
| By year, sex, and 5-year age group | CPS | $$41\times2\times10\times(4-1)$$ | 2,460 |
| By year, sex, 5-year age group, and immigr./native | CPS | $$15\times2\times10\times2\times(4-1)$$ | 1,800 |
| Distribution of experience... | | | 120 |
| Blue collar, by sex | NLSY | $$2\times(13+7)$$ | 40 |
| White collar, by sex | NLSY | $$2\times(13+7)$$ | 40 |
| Home, by sex | NLSY | $$2\times(13+7)$$ | 40 |

Data are drawn from March Supplements of the Current Population Survey for survey years 1968 to 2008 (CPS), the National Longitudinal Survey of Youth for both the 1979 and 1997 cohorts (NLSY), and the CPS matched over two consecutive years (CPS$$^\S$$); survey years 1971–2, 1972–3, 1976–7, 1985–6, and 1995–6 cannot be matched. Statistics from the CPS that distinguish between natives and immigrants can only be computed for surveys from 1994 on. There are 10 five-year age groups (ages 16–65 years), two genders (male and female), two immigrant statuses (native and immigrant), four regions of origin (U.S. (natives), Western countries, Latin America, and Asia/Africa), four educational levels (<12, 12, 13–15, and 16+ years of education), three categories of preschool children living at home (0, 1, and 2+), and five groups for foreign potential experience (fpx) and for years in the country (0–2, 3–5, 6–8, 9–11, and 12+ years). Redundant statistics that are linear combinations of others (e.g. probabilities add up to one) are not included (neither in the table nor in the estimation).
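The counts in Table 2 are products of the number of cells in each conditioning dimension, so they can be checked mechanically. The short Python sketch below transcribes the factors from the table (the dictionary layout and group labels are illustrative choices, not part of the original) and recomputes the group subtotals and the overall total.

```python
from math import prod

# Factors transcribed from Table 2; each tuple multiplies out to one row.
TABLE2 = {
    "choice proportions": [
        (41, 2, 10, 4 - 1), (41, 2, 4, 4 - 1), (41, 2, 3, 4 - 1),
        (15, 2, 4, 4 - 1), (15, 2, 5, 4 - 1), (2, 5 * 5 + 4 * 4, 2 - 1),
    ],
    "mean log wage": [
        (41, 2, 10, 2), (41, 2, 4, 2), (15, 2, 4, 2),
        (15, 2, 5, 2), (2, 5 * 5 + 4 * 4, 2),
    ],
    "wage growth": [
        (35, 2, 2, 2), (35, 2, 10, 2), (13, 2, 4, 2), (13, 2, 5, 2),
    ],
    "wage variance": [(41, 2, 4, 2), (15, 2, 4, 2)],
    "career transitions": [
        (35, 2, 4, 4 - 1), (35, 2, 10, 4, 4 - 1), (13, 2, 4, 4, 4 - 1),
        (15, 2, 4 - 1), (13, 2, 5, 4, 4 - 1),
    ],
    "highest grade completed": [(41, 2, 10, 4 - 1), (15, 2, 10, 2, 4 - 1)],
    "experience": [(2, 13 + 7), (2, 13 + 7), (2, 13 + 7)],
}

# Subtotal per group and the grand total across all groups.
group_totals = {name: sum(prod(row) for row in rows)
                for name, rows in TABLE2.items()}
total = sum(group_totals.values())

for name, count in group_totals.items():
    print(f"{name:26s}{count:>8,}")
print(f"{'TOTAL':26s}{total:>8,}")
```

The three wage groups sum to the 6,044 reported under "Wages", and the grand total reproduces the 27,636 statistics quoted in the text.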
There are four aspects of the data construction that are worth describing in more detail here. First, not all necessary information is contained in a single dataset that can be used in estimation. The CPS has a short panel dimension that allows me to compute transition probabilities, but does not include information on effective experience in blue-collar and white-collar occupations. The NLSY is a long panel and allows me to compute effective experience, but it only follows two cohorts over time (which would make the identification of equilibrium quantities and production function parameters less credible) and it is not refreshed with new cohorts of immigrants. The need to combine these two datasets prevents direct estimation of choice probabilities conditional on all the observable state variables at a time and, hence, the implementation of non-full solution methods (CCP estimation). Second, individuals need to be assigned to mutually exclusive annual choices. I do so following an approach similar to Lee and Wolpin (2006). Individuals are assigned to school if attending school was their main activity at the time of the survey. Otherwise, they are assigned to work if they worked at least 40 weeks during the year preceding the survey and at least 20 hours per week. If working, they are assigned an occupation based on the main occupation held in the previous year (CPS) or the most recent one (NLSY). Blue-collar occupations include craftsmen, operatives, service workers, laborers, and farmers, and the white-collar group includes professionals, clerks, sales workers, managers, and farm managerial occupations. The remaining individuals are assigned to the home alternative. Third, it is important to provide some notes about the measurement of immigration. Immigrants are defined as individuals born abroad. In the CPS, immigrants are only identifiable starting in 1993.
For this reason, all statistics in Table 2 that are conditional on immigrant status are computed only for the period 1993–2007. This is used below to check how well the model fits choices and wages of immigrants for the period before 1993, which constitutes a sort of out-of-sample fit check. Data from the CPS and the U.S. Census are assumed to include both legal and undocumented aliens. In fact, these datasets are used by the literature and by policy makers to quantify the importance of illegal immigration using the residual method (Hanson, 2006; Baker and Rytina, 2013). However, it is generally accepted that they undercount the total stock of undocumented immigrants to some extent (Hanson, 2006). Some papers, like Lessem (2015) and others surveyed in Hanson (2006), use the Mexican Migration Project, which includes information about legal status, but their focus is on Mexican migration. And, fourth, in estimation I compare data statistics with simulated counterparts. Simulated statistics are obtained from simulating the behaviour of cohorts of 5,000 natives and 5,000 immigrants (some starting life abroad and only making decisions once in the U.S.). Thus, each simulated cross-section includes up to $$(5,000+5,000)\times50=500,000$$ observations, which are weighted using data on cohort sizes.

5. Estimation results

5.1. Parameter estimates

This section discusses parameter estimates, listed in Tables 3 and 4. I first present estimates for the fundamental parameters of the model, which are those in Equations (3.2) through (3.7). Then I present the recovered equilibrium values for $$\eta$$ in Equation (3.9), along with some goodness of fit statistics.

TABLE 3. Parameter estimates

A. Origin$$\times$$gender constants

| | Nat. male | Nat. female | Western countries | Latin America | Asia/Africa |
|---|---|---|---|---|---|
| Blue collar | 0 | $$-0.341$$ (0.0010) | 0.087 (0.0292) | $$-0.032$$ (0.0169) | $$-0.021$$ (0.0060) |
| White collar | 0 | $$-0.291$$ (0.0015) | 0.135 (0.0355) | $$-0.161$$ (0.0169) | 0.152 (0.0140) |
| School | 2,186 (68) | 5,866 (84) | 8,291 (249) | 2,302 (864) | 28,207 (382) |
| Home | 16,420 (53) | 11,333 (29) | 16,957 (764) | 11,259 (240) | 15,162 (143) |

B. Wage equations

| | Blue collar | White collar |
|---|---|---|
| Education, natives ($$\omega_{1,nat}$$) | 0.072 (0.0001) | 0.110 (0.0001) |
| Education, immigr. ($$\omega_{1,imm}$$) | 0.058 (0.0005) | 0.109 (0.0004) |
| BC Experience ($$\omega_2$$) | 0.094 (0.0001) | 0.001 (0.0002) |
| BC Experience squared ($$\omega_3$$) | $$-0.0023$$ (0.00001) | $$-0.0006$$ (0.00002) |
| WC Experience ($$\omega_4$$) | 0.028 (0.0002) | 0.106 (0.0002) |
| WC Experience squared ($$\omega_5$$) | $$-0.0013$$ (0.00001) | $$-0.0030$$ (0.00001) |
| Foreign Experience ($$\omega_6$$) | 0.017 (0.0005) | $$-0.059$$ (0.0012) |
| Variance–covariance matrix of i.i.d. shocks | | |
| Std. dev. male ($$\sigma_{male}$$) | 0.452 (0.0059) | 0.589 (0.0024) |
| Std. dev. female ($$\sigma_{female}$$) | 0.389 (0.0042) | 0.476 (0.0033) |
| Correlation coefficient ($$\rho_{BW}$$) | 0.048 (0.0043) | |

C. Utility parameters

| | Male | Female |
|---|---|---|
| Labor market reentry cost ($$\delta_1^{BW}$$) | 8,968 (77) | 12,400 (180) |
| School utility parameters | | |
| Undergraduate Tuition ($$\tau_1$$) | 13,841 (85) | |
| Graduate Tuition ($$\tau_1+\tau_2$$) | 70,970 (869) | |
| Disutility of school reentry ($$\delta_1^S$$) | 29,009 (207) | 37,357 (597) |
| Home utility parameters | | |
| Children ($$\delta_1^H$$) | $$-1,799$$ (47) | 3,626 (75) |
| Trend ($$\delta_2^H$$) | 62.92 (0.83) | 53.77 (0.55) |
| Standard dev. of i.i.d. shocks | | |
| School ($$\sigma^S$$) | 1,150 (8) | 215 (2) |
| Home ($$\sigma^H$$) | 10,227 (638) | 5,316 (243) |

D. Production function

| Elast. of substit. param.: BC vs Eq. ($$\rho$$) | Elast. of substit. param.: WC vs Eq. ($$\gamma$$) | Factor share: Struct. ($$\lambda$$) | Factor share: BC ($$\alpha$$) | Factor share: WC ($$\theta$$) |
|---|---|---|---|---|
| 0.288 (0.006) | $$-0.059$$ (0.005) | 0.073 (0.011) | 0.555 (0.007) | 0.452 (0.010) |

E. Aggregate shock process

| Constant ($$\phi_0$$) | Autoregressive term ($$\phi_1$$) | St. dev. of innovations ($$\sigma_z$$) |
|---|---|---|
| 0.001 (0.003) | 0.328 (0.114) | 0.026 (0.021) |

The table presents parameter estimates for Equations (3.2)–(3.7). Native male constants for wage equations are normalized to zero. Immigrant male and native female constants are estimated. The constant for a female immigrant from region $$i$$ is obtained as the sum of the constant for a region $$i$$ male immigrant and the difference between the constant for native females and native males. The individual subjective discount factor, $$\beta$$, is set to 0.95. Standard errors (calculated as described in Appendix B) are in parentheses. Standard errors for the aggregate shock process are regression standard errors.

TABLE 4. Expectation rules for skill prices

| | Blue-collar skill price | White-collar skill price |
|---|---|---|
| Coefficient estimates | | |
| Constant ($$\eta_0$$) | 0.002 (0.001) | 0.002 (0.002) |
| Autoregressive term ($$\eta_j$$) | 0.324 (0.046) | 0.367 (0.048) |
| $$\Delta$$ Aggregate shock ($$\eta_z$$) | 0.835 (0.046) | 1.118 (0.065) |
| R-squared goodness of fit measures | | |
| Differences | 0.870 | 0.858 |
| Levels | 0.999 | 0.999 |
| Using predicted shock | 0.221 | 0.222 |

The table includes estimates for the coefficients of expectation rules for aggregate skill prices, given by Equation (3.9). Goodness of fit measures are reported in the bottom panel. These measures are computed for the prediction of differences and levels for $$j=B,W$$. The last one uses the predicted increase in the aggregate shock obtained from Equation (3.7) instead of the actual increase. Standard errors (in parentheses) are regression standard errors, and do not account for the error in the estimation of fundamental parameters.

5.1.1. Fundamental parameters of the model

Table 3 presents estimates for the fundamental parameters of the model. Standard errors, in parentheses, account for both sampling and simulation error, as detailed in Appendix B. Panel A presents estimates for the gender$$\times$$origin constants for each alternative. There are substantial differences in preferences and productivity between immigrants and natives, and, among immigrants, by national origin. Latin American immigrants have a comparative advantage in blue-collar jobs, whereas Asian/African and Western immigrants have a comparative advantage in white-collar jobs.
Given the change in the national origin composition of immigrant inflows in recent decades, these differences can explain the increasing concentration of immigrants in blue-collar occupations. Western and Asian/African immigrants also have a stronger preference for education, which makes them more likely to enroll. Returns to education in blue-collar jobs, in Panel B, are smaller for immigrants than for natives (5.8% versus 7.2% per extra year of education), while they are remarkably similar in white-collar jobs (10.9% versus 11.0%). Foreign potential experience is positively rewarded in blue-collar occupations (1.7%) and negatively rewarded in white-collar jobs ($$-5.9$$%). All these differences lead similar natives and immigrants to work in different occupations. Ottaviano and Peri (2012) find that observationally equivalent natives and immigrants are imperfect substitutes in production because they are employed in different occupations. Thus, the model endogenously generates such imperfect substitutability. These estimates have implications for the ability of the model to predict several regularities about immigration that have been established in the literature. LaLonde and Topel (1992), Borjas (1985, 1995, 2015), and Lubotsky (2007) for the U.S., Dustmann and Preston (2012) for the U.K. and the U.S., and Eckstein and Weiss (2004) for Israel show that immigrants assimilate as they spend time in the destination country. As returns to foreign potential experience are smaller than those to domestic experience both in blue-collar and in white-collar jobs, immigrants assimilate in this model: between two observationally equivalent immigrants, the one who has spent more years in the U.S. earns more. Borjas (1985, 1995, 2015) noted that the relative quality of immigrant cohorts decreased in recent years.
In the model, the change in the relative importance of Latin American aliens in the recent cohorts fosters immigrant concentration in blue-collar jobs and leads to a decrease in average immigrant productivity, which is consistent with these findings. Finally, Dustmann et al. (2013) and Dustmann and Preston (2012) find evidence of immigrants downgrading upon arrival in the destination country. The comparative advantage of some groups of immigrants in blue-collar jobs and the differences in productivity with respect to natives generate this phenomenon. This is the case, for example, of Latin American immigrants. Other groups, like the highly educated Western immigrants who arrive in the U.S. straight after attending school in their home country, are more likely to enroll in school, work in white-collar jobs, and earn higher wages in the U.S. than comparable natives, which would be a form of upgrading. One of the most important differences between the production function in Equation (3.6) and the nested-CES production function used in the immigration literature (Borjas, 2003; Ottaviano and Peri, 2012) is that Equation (3.6) allows for capital-skill complementarity and skill-biased technical change. Elasticities of substitution implied by the estimates of $$\rho$$ and $$\gamma$$ are respectively $$1.40$$ and $$0.94$$, very much in line with Krusell et al. (2000).22 These elasticities indicate that equipment capital and white-collar labour are relative complements. This capital-skill complementarity links the fast accumulation of equipment capital and the increase in the white-collar/blue-collar wage gap, as shown in Equation (3.13). Several papers have tested capital-skill complementarity with different data since the seminal work by Griliches (1969). Most of them agree on the existence of some degree of complementarity between capital and skilled labour, even though there is a variety of estimates for the elasticities of substitution (Hamermesh, 1986).
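The step from the substitution parameters to the quoted elasticities can be reproduced with a few lines of Python. The sketch assumes the usual CES mapping, elasticity $$=1/(1-\text{parameter})$$, which is not spelled out in this section but does recover the reported values; the function name `ces_elasticity` is hypothetical.

```python
def ces_elasticity(substitution_param):
    """Elasticity of substitution implied by a CES substitution parameter,
    under the standard mapping sigma = 1 / (1 - parameter) (an assumption)."""
    return 1.0 / (1.0 - substitution_param)

# Point estimates from Table 3, Panel D.
rho = 0.288     # blue-collar labour vs. equipment capital
gamma = -0.059  # white-collar labour vs. equipment capital

sigma_bc = ces_elasticity(rho)
sigma_wc = ces_elasticity(gamma)

print(f"BC vs equipment: {sigma_bc:.2f}")
print(f"WC vs equipment: {sigma_wc:.2f}")
# A lower elasticity for white-collar labour is what "relative
# complements" refers to in the text.
print("relative complementarity:", sigma_wc < sigma_bc)
```

Under this mapping a positive parameter gives an elasticity above one and a negative parameter an elasticity below one, which is why $$\rho>0$$ and $$\gamma<0$$ place the two labour types on opposite sides of the Cobb-Douglas benchmark.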
The remaining parameters of the model, which are crucial for the model to fit choices, wages, and transitions observed in the data, are also reasonable and in line with the literature. Women are less productive than men in both occupations (to a larger extent in blue-collar), obtain a larger utility from attending school, and a smaller utility from staying at home. This is consistent with the observed wage gap, enrollment rates, and female concentration in white-collar occupations. They also obtain a higher boost in home utility when having preschool children, which fits their larger propensity to drop out of the labour market for childbearing. Their reentry costs, both to the labour market and to school, are larger than those of their male counterparts, which, together with a lower variance in the home decision, makes them less likely than men to transit in and out of non-employment. Estimated returns to education fit within the variety of results surveyed by Card (1999), which range from 5% to 15% with most of the estimated causal effects clustering between 9% and 11%. Results are also qualitatively in line with (although somewhat larger than) Keane and Wolpin (1997), Lee (2005), and Lee and Wolpin (2006). Compared to own experience, returns to cross experience are much lower, flatter, and turn negative after a certain level. Along with the positive correlation between blue-collar and white-collar idiosyncratic shocks, this is important to fit observed transitions across occupations. This leaves some degrees of freedom for the variance of the idiosyncratic shock to capture the observed variance in wages. Likewise, the estimated cost of reentering school is quite large ($$29,009$$ and $$37,357$$ US dollars for males and females, respectively), which is in line with the observation that very few individuals return to school after leaving it.
The estimated labour market reentry cost is close to nine thousand US dollars for males and above twelve thousand for females, that is, one quarter and almost one half of the average full-time equivalent annual wage for males and females respectively. All this provides a good fit of wages and transitions across alternatives, as discussed in the next section. Yet, in reality there could be some permanent unobserved factor (ability or taste heterogeneity), not included in the model, that makes individuals more likely to persist in their choices. Omitting such heterogeneity could lead to an overstatement of the other factors that drive persistence in this model. In terms of the estimated parameters, this would potentially imply underestimating cross-experience effects, overestimating transition costs, and/or overestimating returns to own experience. Furthermore, if these unobservables include ability, and high-ability individuals are more likely to acquire education and to earn higher wages for a given educational level, this omission could also induce an overestimation of the returns to education.
5.1.2. Expectation rules
Table 4 presents the equilibrium values of $$\eta$$ in Equation (3.9). The growth rate of the aggregate shock seems to be the most important piece in explaining variation in skill prices. White-collar skill prices react more to shocks than blue-collar prices. Estimates also show some state-dependence, and a small positive trend that adds to the one included in the aggregate shock. The selection of these particular rules as an approximation to rational expectations balanced a trade-off between simplicity and goodness of the approximation.23 The bottom panel of the table includes three different $$R^2$$ measures that summarize the explanatory power of these rules. The first one, a standard $$R^2$$ for the model in differences, indicates that the rules are able to predict more than $$85$$% of the variation in growth rates of skill prices.
The second one measures the goodness of the rules in fitting the variation in levels, displaying almost a perfect fit. This large explanatory power, however, does not imply that individuals have perfect foresight of future skill prices, as they do not observe $$z_{t+1}$$ in period $$t$$. Accounting for that, the third measure replaces $$\Delta\ln z_{t+1}$$ by its prediction from Equation (3.7). Results suggest that individuals are only able to forecast around $$22$$% of the variation in (the growth rate of) skill prices one period ahead, which is far from perfect foresight.
5.2. Model fit
In this section, I compare predicted and actual values of the most relevant aggregates for individuals aged 25–54 years to evaluate the goodness of fit of the estimated model. I focus on this age range because it is the one for which I compare baseline and counterfactual outcomes in Section 6. Figure 2 compares actual and predicted aggregate statistics for both males and females. Panel A includes average years of schooling.24 The model predicts well both the level and the change in education over the sample period for males. For females, it also fits very accurately the increase observed in the data (around 2.5 years), but somewhat under-predicts the level (by around a third of a year).25 Panels B and C compare actual and predicted labour force participation and the fraction of employees working in blue-collar occupations respectively. The model fit of these dimensions is remarkable. It accurately reproduces the participation level, the increase in female labour force participation, the fraction of individuals working in blue-collar occupations, and the gender gaps in the two variables. Panels D and E evaluate the goodness of the model in fitting the distribution of experience in the NLSY samples. For individuals in the NLSY79 (Panel D), experience is measured around 1993, when individuals are aged around 30 years.
For the NLSY97 sample (Panel E), it is measured around 2006, with individuals aged around 25 years. In general, the model provides a good fit of these distributions. Panels F through H show the model fit for wages. The model displays a remarkable fit of female average wages (trend and level), the level of male wages and, hence, the gender gap, the college to high school wage gap (except the trend in the last few years, and the level for women in early years), the trend and level of white-collar to blue-collar wage gaps for males, and the trend in these gaps for females. It is unable to replicate the hump shape in the evolution of male wages observed between 1970 and 1990. This could be because of the rather parsimonious parametrization of the aggregate production function, or of not allowing the returns to skills to vary over such a long period. The model also under-predicts the level of the college and white-collar wage gaps for females, as noted in Footnote 25.
Figure 2 Actual and predicted aggregates
Notes: Dark solid (with crosses): data, male. Light solid (with circles): data, female. Dark dashed: simulations, male. Light dashed: simulations, female. Panels A, B, C, F, G, and H are computed for individuals aged 25–54 years; actual data for these plots are obtained from March Supplements of the CPS (survey years from 1968 to 2008). In Panels D and E, experience is counted around 1993 (D) and 2006 (E) for individuals in each cohort; sources for actual data in these plots are NLSY79 and NLSY97 as indicated.
The fit of the model in terms of transition probabilities is evaluated in Table 5. The table presents actual and predicted transition probability matrices from blue-collar, white-collar, and home alternatives into blue-collar, white-collar, school, and home.26 Transitions from the three alternatives are extremely well replicated by the model. In particular, the model captures very well the persistence in each of the alternatives, occupational switches, the fact that individuals rarely go back to school after leaving it, and transitions back and forth from working to home.
TABLE 5 Actual vs predicted transition probability matrix
Rows: choice in $$t-1$$. Columns: choice in $$t$$.
| Choice in $$t-1$$ | Blue collar, Act. | Blue collar, Pred. | White collar, Act. | White collar, Pred. | School, Act. | School, Pred. | Home, Act. | Home, Pred. |
|---|---|---|---|---|---|---|---|---|
| Blue collar | 0.75 | 0.77 | 0.11 | 0.10 | 0.00 | 0.00 | 0.14 | 0.13 |
| White collar | 0.06 | 0.07 | 0.83 | 0.83 | 0.00 | 0.00 | 0.10 | 0.10 |
| Home | 0.11 | 0.08 | 0.13 | 0.13 | 0.01 | 0.01 | 0.76 | 0.79 |
Notes: The table includes actual and predicted one-year transition probability matrices from blue-collar, white-collar, and home (rows) into blue-collar, white-collar, school, and home (columns) for individuals aged 25–54 years. Actual and predicted probabilities in each row add up to one. Actual data are obtained from one-year matched March Supplements of the CPS (survey years from 1968 to 2008).
The formal discussion on identification in Section 4.1, the discussion of the parameter estimates in Section 5.1, and the results in Figure 2 and Table 5 provide some evidence that the model presented in this article and the variation used to identify its parameters are meaningful. Yet, it is reassuring to explore further evidence in the same direction. The remainder of this section presents six additional exercises that provide further validation of this conclusion.
First, Table 6 analyses the goodness of the model in predicting immigrant choices out-of-sample. As noted in Section 4.3, whether a person is an immigrant or not is only identifiable in the CPS starting in 1993. Thus, no separate information for natives and immigrants before 1993 is used in the estimation. Given that the immigrant group is too small to drive the main aggregate trends (the percentage of immigrants in the population of working-age is below 10%), and that natives and immigrants had very different trends in education and choices over the period, correctly fitting these trends would provide evidence that individual choices are well identified, at least for immigrants.
Table 6 evaluates the goodness of the model in fitting education, participation, and occupational choice of immigrants in census years 1970, 1980, and 1990. To do so, it compares predicted values from the model to data from the U.S. Census microdata samples, which are not used in the estimation. As the table shows, the model does a good job in predicting levels, trends, and gender gaps for the different aggregates.
TABLE 6 Out of sample fit: act. versus pred. statistics for immigrants
Out-of-sample: 1970, 1980, 1990. In-sample: 1993–2007.
| | 1970, Act. | 1970, Pred. | 1980, Act. | 1980, Pred. | 1990, Act. | 1990, Pred. | 1993–2007, Act. | 1993–2007, Pred. |
|---|---|---|---|---|---|---|---|---|
| A. Male | | | | | | | | |
| Share with high school or less | 0.67 | 0.69 | 0.57 | 0.61 | 0.52 | 0.55 | 0.55 | 0.56 |
| Average years of education | 10.8 | 11.1 | 11.4 | 11.8 | 11.7 | 12.1 | 11.9 | 12.1 |
| Participation rate | 0.77 | 0.56 | 0.68 | 0.61 | 0.63 | 0.66 | 0.75 | 0.72 |
| Share of workers in blue-collar | 0.57 | 0.57 | 0.55 | 0.54 | 0.53 | 0.51 | 0.58 | 0.51 |
| B. Female | | | | | | | | |
| Share with high school or less | 0.78 | 0.78 | 0.68 | 0.69 | 0.56 | 0.58 | 0.54 | 0.53 |
| Average years of education | 10.3 | 10.8 | 10.9 | 11.5 | 11.5 | 12.1 | 12.0 | 12.5 |
| Participation rate | 0.32 | 0.25 | 0.36 | 0.31 | 0.41 | 0.40 | 0.49 | 0.52 |
| Share of workers in blue-collar | 0.46 | 0.45 | 0.45 | 0.44 | 0.39 | 0.43 | 0.41 | 0.43 |
Notes: The table presents actual and predicted values of the listed aggregates for immigrants. Statistics for 1993–2007 are obtained from March Supplements of the CPS, and are used in the estimation. Data for 1970, 1980, and 1990 are from U.S. Census microdata samples and not used in the estimation.
Second, Figure A1 in Appendix C gives a sense of whether the variation in the data is enough to identify the parameters.
While the curvature of the minimum distance criterion function is difficult to represent in the multidimensional space, one can plot sections of it by moving one of the parameters and leaving the others fixed at their estimated values. Figure A1 provides these sections for each of the parameters of the model. As the figure shows, all parameters move the criterion function substantially and have a clear minimum at the estimated value.
Third, to further evaluate what variation identifies education decisions in practice, in Table 7 I present estimates of returns to education obtained from fitting OLS and Heckman (1979) selection-corrected regressions on actual and simulated data. All regressions control for gender, year, and potential experience (age minus education) dummies. For the selection correction models, I use dummies for the number of children as exclusion restrictions. Overall, estimates in actual and simulated data are remarkably similar, which suggests that parameters are effectively identified from the variation discussed in Section 4.1.
TABLE 7 Estimated and simulated returns to education
| | Data | Simulation |
|---|---|---|
| Least Squares (OLS) | 0.096 (0.000) | 0.096 (0.002) |
| Selection-corrected (Heckman, 1979) | 0.123 (0.001) | 0.114 (0.005) |
Notes: The table presents coefficients for years of education in OLS and Heckman (1979) selection-corrected regressions fitted on actual and simulated data. All regressions include dummies for potential experience (age minus education), gender, and year. In the selection-correction model, dummies for the number of children are included as exclusion restrictions. Actual data are obtained from the CPS. The sample period is 1967 to 2007. Random subsamples of 500,000 observations are drawn for both actual and simulated data. Nationally representative weights are used in the regressions.
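To make the OLS row of Table 7 concrete, the following is a minimal sketch of a log-wage regression on years of education with gender, year, and potential-experience dummies. It runs on synthetic data only: the data-generating process, the sample size, the function names, and the use of unweighted least squares are all assumptions made for illustration, not the article's procedure. The Heckman (1979) selection correction and the survey weights are not implemented.

```python
import numpy as np


def simulate_wages(n=20000, true_return=0.096, seed=0):
    """Draw a hypothetical sample with a known return to education."""
    rng = np.random.default_rng(seed)
    education = rng.integers(8, 21, size=n)   # years of schooling, 8 to 20
    age = rng.integers(25, 55, size=n)        # ages 25 to 54
    female = rng.integers(0, 2, size=n)       # gender dummy
    year = rng.integers(1967, 2008, size=n)   # calendar years 1967 to 2007
    experience = age - education              # potential experience
    log_wage = (
        1.0
        + true_return * education
        + 0.03 * experience
        - 0.0005 * experience**2
        - 0.2 * female
        + 0.002 * (year - 1967)
        + rng.normal(0.0, 0.4, size=n)
    )
    return education, experience, female, year, log_wage


def dummies(values):
    """One-hot encode a variable, dropping its first level as the reference."""
    levels = np.unique(values)
    return (values[:, None] == levels[None, 1:]).astype(float)


def ols_return_to_education(education, experience, female, year, log_wage):
    """Coefficient on years of education from an unweighted OLS regression."""
    X = np.column_stack([
        np.ones(len(education)),  # intercept
        education,                # coefficient of interest
        female,
        dummies(experience),      # potential-experience dummies
        dummies(year),            # year dummies
    ])
    beta, *_ = np.linalg.lstsq(X, log_wage, rcond=None)
    return beta[1]


estimate = ols_return_to_education(*simulate_wages())
print(f"estimated return to education: {estimate:.3f}")
```

Running the same function on actual and on model-simulated samples and comparing the two coefficients mirrors the logic of the comparison in Table 7, although a faithful replication would also need the weights and the selection-corrected specification.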