TY - JOUR
AU - Schnell, Molly
AB - Abstract

We study the causes of “nutritional inequality”: why the wealthy eat more healthfully than the poor in the United States. Exploiting supermarket entry and household moves to healthier neighborhoods, we reject that neighborhood environments contribute meaningfully to nutritional inequality. We then estimate a structural model of grocery demand, using a new instrument exploiting the combination of grocery retail chains’ differing presence across geographic markets with their differing comparative advantages across product groups. Counterfactual simulations show that exposing low-income households to the same products and prices available to high-income households reduces nutritional inequality by only about 10%, while the remaining 90% is driven by differences in demand. These findings counter the argument that policies to increase the supply of healthy groceries could play an important role in reducing nutritional inequality.

I. Introduction

A wave of recent studies has drawn increased attention to the causes and consequences of socioeconomic inequality (Saez and Piketty 2003; Aizer and Currie 2014; Chetty et al. 2014, 2016; Case and Deaton 2015). This inequality plays out in many different ways, including educational opportunities, health outcomes, mass incarceration, and social networks. In this article, we study one additional correlate of socioeconomic status, what we eat and drink, and quantify the economic mechanisms that drive nutritional inequality. Because what we eat and drink is a key driver of obesity and other health outcomes, understanding why nutritional inequality exists is crucial for designing policies to address socioeconomic disparities in health.
A large body of literature has documented that low-income neighborhoods are more likely to be “food deserts”—that is, areas with low availability or high prices of healthy foods.1 Many public health researchers, policy makers, and advocates further argue that food deserts are an important cause of unhealthy eating.2 Despite limited evidence supporting this causal claim, both the federal government and local municipalities spend millions of dollars a year on supply-side policies that subsidize and assist grocers in underserved areas.3 It is certainly possible that differential access to healthy foods is at least partially to blame for nutritional inequality. Bitler and Haider (2011) discuss how zoning restrictions, crime, and other factors could discourage entry by grocers that would sell healthy foods in low-income areas. Furthermore, results from the Moving To Opportunity experiment suggest that moving to lower-poverty neighborhoods reduces obesity (Kling, Liebman, and Katz 2007; Ludwig et al. 2011). On the other hand, it is natural to imagine that the observed supply differences across neighborhoods are simply equilibrium responses to differences in consumer preferences. Thus, teasing apart supply-side versus demand-side explanations is crucial for understanding the possible effects of supply-side policies. This article combines reduced-form analyses with a structural demand model to quantify the relative importance of local supply and demand factors in generating nutritional inequality. We exploit a rich combination of data sets, including Nielsen Homescan—a nationally representative panel survey of the grocery purchases of 61,000 households—and Nielsen’s Retail Measurement Services (RMS)—a national panel of UPC-level sales data from 35,000 stores covering about 40% of all U.S. grocery purchases. 
We match the Homescan data to surveys of panelists’ nutrition knowledge, preferences, and health outcomes gathered by Nielsen for Allcott, Lockwood, and Taubinsky (2019). Finally, we gather data on the entry dates and exact locations of all 6,721 new supermarkets that opened in the United States from 2004 to 2016, along with annual data on retail establishments in each ZIP code. We thus have an extraordinarily rich window into households’ choice sets, information sets, local environments, and resulting consumption and health. We begin by laying out two basic facts that motivate the debate on food deserts and the causes of nutritional inequality. First, there is a meaningful nutrition–income relationship: households in the top income quartile buy groceries that are 0.56 standard deviations more healthful than the bottom income quartile, as measured by our version of the Healthy Eating Index, the standard measure of dietary quality. Second, RMS stores in low-income neighborhoods offer less healthy groceries than stores in high-income neighborhoods. Low-income neighborhoods have more drug and convenience stores and fewer large supermarkets, which offer a wider variety of healthy options. We use two reduced-form event studies to test whether the local environment has an economically significant effect on healthy eating. The first study looks within households, before versus after the entry of a new supermarket nearby. In the full sample and the sample of households in food deserts, we find effects of supermarket entry on healthy eating that are occasionally statistically significant but always economically small. We document an intuitive reason for these small effects: while consumers shift their purchases toward the new entrants, these purchases are primarily substituted away from other supermarkets rather than drug stores and convenience stores that offer less healthy choice sets. 
Indeed, even households living in ZIP codes with no supermarkets still buy 85% of their groceries from supermarkets. We can bound the short-run effect of differential access to supermarkets at no more than about 1.5% of the nutrition–income relationship. Our second event study tests the hypothesis that a broader set of place-related factors, including peer effects or supply differences other than supermarket density, contribute to nutritional inequality. To do this, we exploit the fact that thousands of households move between ZIP codes or counties while in the Homescan panel. Before a move, households exhibit no trend in healthy eating. After a move, households converge toward eating patterns in the new location by an amount that is statistically significant but economically small. Any endogeneity in moving decisions likely biases these estimates upward. Although the panel is not long enough to study households for more than a few years after a move, we can bound the medium-term, partial equilibrium effects of place as contributing no more than 2%–3% of the nutrition–income relationship. To complement the reduced-form analyses, the second half of the article uses a structural model to estimate household grocery demand and carry out counterfactual simulations. Building on Dubois, Griffith, and Nevo (2014), we specify a utility function with constant elasticity of substitution preferences over individual products, Cobb-Douglas preferences for product groups (milk, bread, candy, vegetables, etc.), and linear preferences for observed characteristics (added sugar, salt, saturated fat, etc.) and unobserved characteristics. To address the identification challenge associated with the endogeneity of prices, we introduce a new instrument using the variation in prices generated by grocery retail chains’ differing comparative advantages in supplying different product groups combined with chains’ differing geographic presence across markets. 
To illustrate, suppose that there are two types of foods, apples and pizza, and two grocery chains, Safeway and Shaw’s. Suppose Safeway is able to supply pizza cheaply, and Shaw’s can supply apples cheaply. Then, cities dominated by Safeway will have relatively low prices for pizza, and cities dominated by Shaw’s will have relatively low prices for apples. Our key identifying assumption is that geographic variation in prices due to the presence of specific chains is independent of geographic variation in unobserved preferences. Consistent with this assumption, the instrument is uncorrelated with variation in preferences that can be predicted by consumer demographics, and the results change little when adding controls for geography and income that would capture unobserved preference heterogeneity. The instrument has a very strong first stage, and it may be useful for other researchers in similar settings. The estimates show a striking and systematic relationship between household income and preferences for healthy dietary characteristics. Higher-income households have stronger preferences for six out of the Healthy Eating Index’s eight “healthy” dietary components (whole fruit, other fruit, whole grains, green vegetables and beans, other vegetables, and dairy) and weaker preferences for two out of four “unhealthy” components (sodium and added sugar). These preference differences are economically significant: households in the bottom income quartile are willing to pay $0.43 per 1,000 calories to consume the bundle of dietary components that would receive the maximum Healthy Eating Index score instead of the minimum, whereas households in the top income quartile are willing to pay $1.14—almost three times as much. We find that about 20% of the income-related preference differences are econometrically explained by education and another 14% are explained by nutrition knowledge. 
We use the demand model to simulate counterfactuals in which bottom income quartile households are exposed to the prices and product availability experienced by top income quartile households. Only about 10% of the nutrition–income relationship is driven by these differences in supply, while 90% of the relationship is driven by differences in demand. Similar to our reduced-form analyses, these findings suggest that supply-side policy initiatives aimed at eliminating food deserts will have limited effects on healthy eating in disadvantaged neighborhoods. We also use the demand model to study alternative policies that do not focus solely on supply. Specifically, we simulate means-tested subsidies for healthy foods, which in principle could be implemented as part of the existing Supplemental Nutrition Assistance Program (SNAP). We find that it would cost $11 billion a year—or about 15% of the current SNAP budget—to fund a subsidy large enough to induce low-income households to purchase groceries as healthy as those purchased by high-income households. These results highlight the importance of research such as Bartlett et al. (2014) exploring whether SNAP should be modified to encourage healthy eating. Our article contributes a unified set of insights about why the wealthy and the poor eat differently in the United States. Our results add to findings on the impacts of proximity to supercenters (Courtemanche and Carden 2011; Volpe, Okrent, and Leibtag 2013; Courtemanche et al. 2018) and fast food restaurants (Davis and Carpenter 2009; Dunn 2010; Currie et al. 2010; Anderson and Matsa 2011), as well as case studies in the public health literature of individual grocery store entry (e.g. Wrigley, Warm, and Margetts 2003; Cummins et al. 2005; Song et al. 2009; Weatherspoon et al. 2012; Elbel et al. 
2015).4 Our household migration event study adds a nutritional aspect to recent work that uses migration to understand the evolution of brand preferences (Bronnenberg, Dubé, and Gentzkow 2012), the caloric costs of culture (Atkin 2016), the effects of urban sprawl on obesity (Eid et al. 2008), and the drivers of geographic variation in health and health care utilization (Finkelstein, Gentzkow, and Williams 2016, 2018a, 2018b; Molitor 2018). Recent work by Hut (2018) finds similarly small short-run effects of migration on healthy eating but larger associations with local consumption patterns in the long run. Our structural demand analysis builds on the framework introduced by Dubois, Griffith, and Nevo (2014) and adds a novel identification strategy and price instrument. Finally, the decomposition of our preference estimates builds on work measuring correlates of health behaviors (Furnee, Groot, and Brink 2008; Cutler and Lleras-Muney 2010; Grossman 2015). Notably, the new Homescan add-on survey provides a remarkable opportunity to connect large-sample scanner data to measures of health preferences and nutrition knowledge. Sections II through VIII present data, stylized facts, reduced-form empirical analyses, demand model setup, demand model estimation and results, counterfactual estimates, and the conclusion. All appendix material is in the Online Appendix.

II. Data

II.A. Nielsen Homescan and Retail Scanner Data

We use the Nielsen Homescan Panel for 2004–2016 to measure household grocery purchases. Homescan includes about 169,000 unique households, of which we observe about 39,000 in each year for 2004–2006 and about 61,000 in each year for 2007–2016. Homescan households record UPCs of all consumer packaged goods they purchase from any outlet. We consider only food and drink purchases, and we further exclude alcohol and health and beauty products such as vitamins. Although comprehensive, the Homescan data are not without limitations.
First, Homescan does not include data on food purchased away from home in establishments like restaurants.5 We therefore focus on explaining income-related differences in the take-home market (i.e., grocery purchases) instead of overall diets. Second, most households only record purchases of packaged items with UPCs, not nonpackaged groceries such as bulk produce and grains. For 2004–2006, however, the data include an 8,000-household “magnet” subsample that also recorded prices and weights of nonpackaged groceries. Online Appendix Figure A1 shows that about 60% of magnet households’ produce calories are from packaged goods that are observed in the full Homescan sample, and this proportion does not vary statistically by income. This suggests that the focus on packaged groceries does not significantly detract from our results, both because produce represents a small share of overall grocery purchases and because packaged produce is a significant and reasonably representative portion of produce purchases. Each year, Homescan households report demographic variables such as household income (in 16 bins), household composition, race, and the age, educational attainment, employment status, and weekly work hours for male and female household heads. For households with two household heads, we use the mean of the age, education, and employment variables observed for both heads. We combine calorie needs by age and gender as reported in the U.S. government’s Dietary Guidelines with Homescan household composition to get each household’s daily calorie need. Our household size variable measures the number of adult “equivalents” in the household, where children are scaled into adults by their calorie needs. 
In addition to the standard Homescan data, we observe self-reports of the importance of staying healthy, a detailed nutrition knowledge quiz, body mass index (weight/height²), and diabetes status from a Homescan PanelViews add-on survey carried out by Nielsen for Allcott, Lockwood, and Taubinsky (2019) in 2017. Table I, Panel A presents descriptive statistics for Homescan households. Unless otherwise stated, all Homescan results are weighted for national representativeness.

Table I
Descriptive Statistics

Variable                                   Mean     Standard deviation
Panel A: Nielsen Homescan households
  Household income ($000s)                 61.0     43.3
  Years education                          13.9     2.06
  Age                                      52.3     14.4
  Household size                           2.38     1.33
  White                                    0.77     0.42
  Black                                    0.12     0.32
  Married                                  0.50     0.50
  Employed                                 0.61     0.44
  Weekly work hours                        22.9     16.7
  Household daily calorie need             5,192    2,959
  Health importance                        0        1
  Nutrition knowledge                      0        1
  Body mass index (kg/m²)                  29.5     7.0
  Diabetic                                 0.19     0.38
Panel B: ZIP code establishment counts
  Grocery                                  1.67     4.07
    Large grocery (> 50 employees)         0.46     1.05
  Supercenters/club stores                 0.11     0.40
  Drug stores                              1.10     2.31
  Convenience stores                       3.14     5.25
  Meat/fish/produce stores                 0.27     0.96

Notes. Homescan data include 731,994 household-by-year observations for 2004–2016 and are weighted for national representativeness. The U.S. government Dietary Guidelines include calorie needs by age and gender; we combine that with Homescan household composition to get each household’s daily calorie need. Household size is the number of household heads plus the total calorie needs of all other household members divided by the nationwide average calorie need of household heads. Health importance, nutrition knowledge, body mass index, and the diabetic indicator are from Homescan add-on surveys carried out by Nielsen for Allcott, Lockwood, and Taubinsky (2019); the former two variables are normalized to have a mean of 0 and a standard deviation of 1. Health importance is the response to the question, “In general, how important is it to you to stay healthy, for example by maintaining a healthy weight, avoiding diabetes and heart disease, etc.?” Nutrition knowledge is from a battery of 28 questions drawn from the General Nutrition Knowledge Questionnaire (Kliemann et al. 2016). If two household members responded to the PanelViews survey, we take the mean of each survey variable across the two respondents. ZIP code establishment counts are from ZIP Code Business Patterns data for 2004–2016, with 508,951 ZIP code–by-year observations.
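The household-size definition in the table notes can be illustrated with a minimal sketch. Each household head counts as one adult, and every other member is scaled by calorie need relative to the average head. The function name `adult_equivalents`, the constant `AVG_HEAD_CALORIE_NEED`, and all calorie values are hypothetical placeholders chosen for illustration; they are not figures from the Dietary Guidelines or from the authors' code.

```python
# Hypothetical nationwide average daily calorie need of a household head.
# This value is an illustrative assumption, not a number from the article.
AVG_HEAD_CALORIE_NEED = 2200.0


def adult_equivalents(n_heads, other_member_calorie_needs,
                      avg_head_need=AVG_HEAD_CALORIE_NEED):
    """Return household size in adult equivalents.

    Size = number of household heads
           + (total calorie need of all other members / average head need).
    """
    if n_heads < 0:
        raise ValueError("n_heads must be non-negative")
    if avg_head_need <= 0:
        raise ValueError("avg_head_need must be positive")
    return n_heads + sum(other_member_calorie_needs) / avg_head_need


# Two heads plus two children with assumed needs of 1,100 and 1,650 calories.
example_size = adult_equivalents(2, [1100.0, 1650.0])
```

With these assumed inputs, the two children together count as 2,750 / 2,200 = 1.25 adults, so the household has 3.25 adult equivalents.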
The Nielsen RMS data consist of weekly prices and sales volumes for each UPC sold at approximately 42,000 unique stores from 160 retail chains for 2006–2016, of which we observe about 35,000 in each year. We exclude liquor stores. RMS includes 53%, 32%, 55%, and 2% of sales in the grocery, mass merchandiser, drug, and convenience store channels, respectively. As with Homescan, RMS does not include sales of bulk produce and other nonpackaged items. We deflate prices and incomes to 2010 dollars using the consumer price index for urban consumers for all items. For all empirical analyses other than the supermarket entry event study, we collapse data to the household (or store)-by-year level.

II.B. Grocery Retail Establishments

Studying the effects of retailer entry requires reliable data on store opening dates to avoid attenuation bias. Some data sets, such as InfoUSA and the National Establishment Time Series, might be useful for cross-sectional analyses, but they do not record the opening dates of new establishments with sufficient precision (see Bitler and Haider 2011, p. 162).
We measure supermarket entry using Nielsen’s TDLinx data set, a geocoded census of all food retailers in the United States, including the month each store opened and its exact latitude and longitude. In validation checks, we found that TDLinx data closely match entry dates and locations for 1,914 supermarkets for which we have the exact address and opening date from four retailers’ administrative records. We include only club stores, supercenters, and grocery stores (excluding “superettes”), and we further restrict to entries for which the retailer could be matched to a retailer code in the Homescan data, which excludes small “mom and pop” groceries.6 For simplicity, we call the set of included stores “supermarkets.” There were 6,721 entries of supermarkets in the United States between 2004 and 2016. We also use ZIP Code Business Patterns (ZBPs), which gives a count of establishments by NAICS code and employment size category for every ZIP code as of March 10 each year. The ZBP data are drawn from tax records, the U.S. Census Company Organization Survey, and other administrative data. Table I, Panel B presents descriptive statistics for the ZBP data.

II.C. Nutrition Facts and the Health Index

Our nutrition facts are from the Food and Nutrient Database for Dietary Studies and the National Nutrient Database for Standard Reference available from the U.S. Department of Agriculture (USDA; U.S. Department of Agriculture, Agricultural Research Service 2016, 2018). We match these nutrition facts to UPCs using crosswalks developed by the USDA (Carlson et al. 2019). The UPC-level USDA nutrition facts closely match those from a marketing data provider that we used in a previous version of this article.
To characterize goods and preferences on a single index of healthfulness per calorie, we use a slightly modified version of the standard dietary quality measure in the United States: the USDA’s Healthy Eating Index (HEI; Guenther, Reedy, and Krebs-Smith 2008; Volpe and Okrent 2012). The HEI scores diets on a scale from 0 to 100, adding points for consumption of “healthy” components (fruits, vegetables, whole grains, dairy, and proteins) and subtracting points for consumption of “unhealthy” components (refined grains, sodium, saturated fats, and added sugars). The HEI is usually used to score one person's full diet, and it is nonlinear in its components. For example, added sugar consumption reduces the HEI linearly at a prescribed “slope” until added sugars reach a threshold of 26% of calories, after which the HEI is unchanged. These nonlinearities are debated by nutritionists and are inappropriate for many of our analyses, in which we observe partial diets for entire households (in Homescan) or purchases by many consumers (in store-level RMS data). We thus construct a linearized version of the HEI that continues scoring nutrients with the prescribed slope, regardless of whether consumption is beyond a minimum or maximum threshold.7 We find that the linearization makes little difference: the correlation between “true” HEI and linearized HEI is 0.91 in our household-by-year Homescan data. For ease of interpretation, we normalize our linearized HEI to have a mean of 0 and a standard deviation of 1 across households. We call this linearized, normalized HEI the Health Index. Online Appendix Table A1 shows that the strongest correlates of the Health Index in household-by-year Homescan data are purchases of fruits and vegetables (with correlation coefficients of ρ ≈ 0.4 to 0.6), whole grains (ρ ≈ 0.50), sea and plant protein (ρ ≈ 0.64), added sugar (ρ ≈ −0.41), and solid fats (ρ ≈ −0.44). 
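The difference between capped and linearized scoring described above can be sketched for a single “unhealthy” component. This is an illustration under stated assumptions, not the authors' implementation: only the 26% upper threshold for added sugar comes from the text, while the lower threshold `LO`, the point scale `MAX_POINTS`, and all function names are hypothetical. The `standardize` helper is unweighted, whereas the article's normalization may use sample weights.

```python
from statistics import fmean, pstdev

# Only HI (26% of calories) is taken from the text.
# LO and MAX_POINTS are illustrative assumptions.
LO, HI, MAX_POINTS = 0.065, 0.26, 10.0


def capped_score(share, lo=LO, hi=HI, max_points=MAX_POINTS):
    """Threshold-style scoring: full points at or below lo, zero at or above hi."""
    clipped = min(max(share, lo), hi)
    return max_points * (hi - clipped) / (hi - lo)


def linearized_score(share, lo=LO, hi=HI, max_points=MAX_POINTS):
    """Same slope as capped_score, but extended beyond both thresholds."""
    return max_points * (hi - share) / (hi - lo)


def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu, sd = fmean(values), pstdev(values)
    if sd == 0:
        raise ValueError("values must not all be identical")
    return [(v - mu) / sd for v in values]
```

Between the two thresholds the functions agree. Above 26% the capped score stays at zero while the linearized score keeps falling, which is the property that lets partial diets and store-level purchases be scored on a common scale.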
Online Appendix Table A2 shows that both the “true” HEI and our linearized HEI are highly correlated with Homescan panelists’ BMI and diabetes status.

III. Stylized Facts: Purchases and Supply of Healthful Foods

III.A. Purchase Disparities: The Nutrition–Income Relationship

We begin by using the Homescan data to provide basic facts on socioeconomic disparities in dietary quality. Even basic stylized facts from these data are important: although there is a large body of literature on social class and diet quality, most previous work uses either data sets that only cover a few states or municipalities or national data sets such as NHANES that are an order of magnitude smaller than Homescan (Darmon and Drewnowski 2008). For comparability to other work on inequality, we proxy for socioeconomic status (SES) using household income. Because household income varies with household size, age, and survey year for reasons unrelated to household SES, we control for these factors in our analyses. Figure I presents binned scatterplots of how dietary quality varies with household income, residual of household size and age and year indicators. We use three illustrative measures (grams of added sugar per 1,000 calories purchased, share of bread calories from whole-grain breads, and share of total calories from packaged produce) and a summary measure, the Health Index across all grocery purchases. These four measures paint a consistent picture: low-income households purchase less healthful foods.8

Figure I
Healthfulness of Grocery Purchases by Household Income
This figure presents Nielsen Homescan data for 2004–2016. Each panel presents a binned scatterplot of a grocery healthfulness measure against average household income across all years the household is observed in Homescan, residual of age and year indicators and household size. Added sugar is the grams of added sugar per 1,000 calories purchased; whole grain is the calorie-weighted average share of bread, buns, and rolls purchases that are whole grain; produce is the share of calories from fresh, canned, dried, and frozen fruits and vegetables; and the Health Index is our overall measure of the healthfulness of grocery purchases, normalized to have a mean of 0 and a standard deviation of 1 across households. Observations are weighted for national representativeness.

To quantify this nutrition–income relationship, we calculate the difference in overall dietary quality between households in the highest and the lowest income quartiles. Conditional on household size and age and year indicators, the top income quartile buys groceries with a Health Index that is 0.56 standard deviations higher than the bottom income quartile.
A key objective in the rest of the article is to explain this 0.56 standard deviation difference.9 We can benchmark the potential impacts of these differences in dietary quality using relationships between dietary quality and health outcomes, with the important caveat that these relationships are from correlation studies and thus may not be causal. For example, Figure I illustrates that households in the bottom income quartile purchase approximately 9.7 more grams of added sugar per 1,000 calories than households in the top quartile. The estimates in Yang et al. (2014) imply that 9.7 fewer grams of added sugar per 1,000 calories is conditionally associated with a 26% decrease in death rates from cardiovascular disease. Similarly, the conditional correlations in our Online Appendix Table A2 imply that the 0.56 standard deviation difference in the Health Index is conditionally associated with a 0.67 difference in BMI (0.09 standard deviations) and a 1.9 percentage point difference in diabetes prevalence (11%).

III.B. Supply Disparities by Neighborhood Income

Having documented socioeconomic disparities in consumption, we now provide basic facts on how the supply of healthy groceries varies between high- and low-income neighborhoods. The four panels of Figure II present the relationship between ZIP code median income and the healthfulness of items offered in RMS stores, again measured by added sugar content, the share of bread UPCs that are whole grain, the share of all UPCs that are packaged produce, and the mean Health Index of UPCs offered. The measures used in this figure weight UPCs only by calories in the package and not by quantity sold, so this figure reflects choice sets, not consumption. All four panels show the same qualitative result: stores in higher-income ZIP codes offer healthier items.10

Figure II
Store Average Healthfulness and Size by ZIP Code Median Income
This figure uses Nielsen RMS data for 2006–2016. To measure store healthfulness, we construct the following across all UPCs offered in each store: average grams of added sugar per 1,000 calories; the calorie-weighted share of bread, buns, and rolls UPCs that are whole grain; the calorie-weighted share of UPCs that are produce; and the calorie-weighted mean Health Index. This figure presents the means of these variables across stores within categories of ZIP code median income.

The store types we call “supermarkets” (large grocery stores, supercenters, and club stores) generally offer a wider variety of healthy items and packaged and bulk produce compared with small grocery stores, convenience stores, and drug stores.11 Using the ZBP data, Figure III shows that lower-income neighborhoods tend to have fewer supermarkets and more drug and convenience stores per capita. Although only 24% of population-weighted ZIP codes have no supermarket, 55% of population-weighted ZIP codes with median income below $25,000 have no supermarket.12

Figure III
Store Counts by ZIP Code Median Income
This figure presents counts of stores per 1,000 residents for the average ZIP code in each income category, using 2004–2016 ZIP Code Business Patterns data. Large (small) grocers are defined as those with 50 or more (fewer than 50) employees.
IV. Reduced-Form Analysis: Effect of Access on Consumption

In the previous section, we documented that low-income households consume less healthy groceries and that low-income neighborhoods have less local supply of healthy foods. As described in the introduction, these two stylized facts have raised questions in policy circles about the extent to which limited availability of healthy foods causes nutritional inequality. Of course, supply and demand are determined in equilibrium. Neighborhood supply could also be correlated with demand due to simultaneity (where supply responds to demand, in addition to demand responding to supply) or because of other unobserved factors that systematically affect both supply and demand in low-income neighborhoods. To measure the elasticity of healthy-grocery demand with respect to local supply, we need quasi-exogenous variation in local supply. In this section, we use retail entry and household moves in event studies that look within households as their local retail environments change.

IV.A. Effects of Supermarket Entry

We begin by using an event study framework to measure the effects of supermarket entry on grocery purchases. Over our sample period, we observe 6,721 supermarket entries. Although the ideal experiment to measure the effects of store entry would randomly assign new supermarkets to different neighborhoods, stores enter for reasons that may create endogeneity concerns. In the analysis that follows, we include household fixed effects and measure how grocery consumption changes after a supermarket opens nearby.
Although supermarkets often open and close in response to long-term changes in neighborhood composition, it seems implausible that supermarkets plan their openings for the exact time at which households will suddenly change their demand patterns. We consider the impact of supermarket entries that occur within a 0–10- or 10–15-minute drive of households’ census tract centroids.13 We compute driving times between each census tract centroid and the address of each entering supermarket using the Google Maps application program interface (API) and assuming no congestion delay. In our data, 66% of households experience zero entries and 19% of households experience only one entry within a 10-minute drive. For this analysis only, we use household-by-quarter data to exploit the precision we have in supermarket entry dates.

Let |$S_{bct}$| be the count of supermarket entries that have occurred within driving time band |$b$| (where |$b$| = [0, 10) or [10, 15) minutes) of census tract |$c$| as of quarter |$t$|, and let |$\boldsymbol{X}_{it}$| denote the vector of potentially time-varying household covariates presented in Table I.14 Letting |$Y_{ict}$| denote an outcome for household |$i$| in census tract |$c$| in quarter |$t$|, we run the following regression in household-by-quarter Homescan data:

$$\begin{equation} Y_{ict}=\sum _{b\in \lbrace [0,10),[10,15)\rbrace }\tau _{b}S_{bct}+\gamma \boldsymbol{X}_{it}+\mu _{d(c)t}+\phi _{ic}+\varepsilon _{ict}, \end{equation}$$ (1)

where |$\mu_{d(c)t}$| is a vector of census division-by-quarter indicators, and |$\phi_{ic}$| is a household-by-census tract fixed effect. As we study in Section IV.B, some Homescan households move while in the sample. Conditioning on |$\phi_{ic}$| isolates variation in supply due to entry, not household migration. Because the set of households exposed to local entry is already not nationally representative, we do not use the Homescan sample weights for this analysis.
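To make the structure of equation (1) concrete, the following Python sketch simulates a small household-by-quarter panel and recovers the two entry coefficients by ordinary least squares with dummy variables. It is an illustration under strong simplifying assumptions, not the estimation code used here: the data are synthetic and noise-free, households never move (so the household-by-tract effect reduces to a household effect), plain quarter indicators stand in for the census division-by-quarter indicators, each household sees at most one entry per band, and the covariates are omitted. All function and variable names are hypothetical.

```python
import numpy as np

def cumulative_entries(entry_quarter, n_quarters):
    """Entry count as of each quarter, for at most one entry.

    entry_quarter is the quarter index of the entry, or None if no
    supermarket enters in this driving-time band.
    """
    s = np.zeros(n_quarters)
    if entry_quarter is not None:
        s[entry_quarter:] = 1.0
    return s

def estimate_tau(y, S, household, quarter):
    """OLS of y on entry counts S plus household and quarter dummies."""
    H = (household[:, None] == np.unique(household)[None, :]).astype(float)
    Q = (quarter[:, None] == np.unique(quarter)[None, :]).astype(float)
    X = np.hstack([S, H, Q])
    # The two dummy sets are collinear with each other; lstsq returns a
    # least-squares solution, and the coefficients on S are still unique.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[: S.shape[1]]

rng = np.random.default_rng(0)
n_households, n_quarters = 40, 12
true_tau = np.array([1.5, 0.5])  # made-up effects for the two bands

household = np.repeat(np.arange(n_households), n_quarters)
quarter = np.tile(np.arange(n_quarters), n_households)

S_blocks = []
for _ in range(n_households):
    # A draw at or beyond n_quarters means "no entry in this band".
    draws = rng.integers(1, 2 * n_quarters, size=2)
    bands = [
        cumulative_entries(int(d) if d < n_quarters else None, n_quarters)
        for d in draws
    ]
    S_blocks.append(np.column_stack(bands))
S = np.vstack(S_blocks)

phi = rng.normal(size=n_households)  # household fixed effects
mu = rng.normal(size=n_quarters)     # quarter effects
y = S @ true_tau + phi[household] + mu[quarter]

tau_hat = estimate_tau(y, S, household, quarter)
```

Because the simulated outcome has no noise, `tau_hat` should match `true_tau` up to floating-point error. With real data of this size, one would typically absorb the fixed effects by demeaning rather than building dummy matrices.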
When estimating equations (1) and (2), we use robust standard errors with two-way clustering by household and census tract.

Before estimating equation (1), we show graphical results of the event study. We define |$E_{bcqt}$| as an indicator variable denoting whether one supermarket had entered in distance band |$b$| of census tract |$c$| by |$q$| quarters after quarter |$t$|. |$B_{bit}$| is an indicator variable for whether observation |$it$| is part of a balanced panel around one supermarket entry in distance band |$b$|, meaning that the household is observed in the same census tract continuously for all four quarters before and all eight quarters after one supermarket entry and experienced only one entry in distance band |$b$| during that window. We run the following regression in household-by-quarter Homescan data:

$$\begin{eqnarray} Y_{ict}&=&\sum _{b\in \lbrace [0,10),[10,15)\rbrace }\left[B_{bit}\left(\sum _{q}\tau _{bq}E_{bcqt}+\alpha _{b}\right)\right]\\ &&\nonumber+\,\gamma \boldsymbol{X}_{it}+\mu _{d(c)t}+\phi _{ic}+\varepsilon _{ict}. \end{eqnarray}$$ (2)

The interaction with |$B_{bit}$| ensures that we identify |$\tau_{bq}$| and |$\alpha_{b}$| using only the households in the balanced panel, although we include the full sample in the regression to improve the precision on the household covariates and fixed effects. The omitted category is |$q = -1$|, so all coefficients are relative to the outcome in the last quarter before entry.

Figure IV presents the |$\tau_{[0,10)q}$| coefficients and 95% confidence intervals. The panels on the left include the full sample, and the panels on the right include only the 23% of the sample living in ZIP codes with no supermarkets, which we call “food deserts.”15

Figure IV. Event Study of Supermarket Entry. This figure presents the |$\tau_{[0,10)q}$| parameters and 95% confidence intervals from estimates of equation (2), the effects of supermarket entry, using 2004–2016 household-by-quarter Homescan data.
All regressions control for household demographics (natural log of income, natural log of years of education, age indicators, household size, race indicators, a married indicator, employment status, and weekly work hours), census-division-by-quarter of sample indicators, and household-by-census-tract fixed effects. The top two panels present effects on expenditure shares (the share of all grocery expenditures recorded in Homescan, in units of percentage points) across all retailers with stores that have entered within a 15-minute drive of the household. The middle two panels present effects on the combined expenditure share of grocery stores, supercenters, and club stores. We keep the y-axis on the same scale between the top and middle panels so that the magnitudes can be compared easily. The bottom panels present effects on the Health Index, our overall measure of the healthfulness of grocery purchases, which is normalized to have a mean of 0 and a standard deviation of 1 across households. The left panels include the full sample, and the right panels include only the “food desert” subsample: observations with no grocery stores with 50 or more employees, supercenters, or club stores in the ZIP code in the first year the household is observed there. Observations are not weighted for national representativeness.

The “entrant share” variable used in the top panels is the share of household |$i$|’s total grocery expenditures that are spent at any store of a retailer with a store entering within a 15-minute drive. The top left panel shows that this entrant expenditure share increases by about 2 percentage points one year after entry. In other words, a retail chain adding a new store earns an additional $2 for every $100 in grocery expenditures by nearby residents. The top right panel shows that the point estimates are larger in food deserts, rising closer to 3 percentage points. As a point of comparison, ZBP data show that the average household experiencing entry has 7.1 existing supermarkets across ZIP codes with centroids within a three-mile radius.
If the typical local market is about this size, if all existing supermarkets are from different chains, and if the new entrant splits the market equally with its existing competitors, then the entrant retail chain would have a |$\frac{1}{8.1}$| market share, or about 12%. Our expenditure share coefficient is smaller than this hypothetical benchmark for two reasons. First, our results in Online Appendix Figure A6 (discussed below) suggest that the local market is larger than ZIP codes within a three-mile radius. Second, the effects on total expenditures across all stores owned by the entrant retailer likely understate expenditures at the specific entering store, as some expenditures may be diverted from the entrant retailer’s other stores.16 The middle panels in Figure IV show households’ expenditure shares at all grocery stores, supercenters, and club stores. We keep the y-axis scales the same for the top and middle panels so that the magnitudes can be easily compared. To the extent that supermarket entry simply diverts sales from other supermarkets that offer a similar variety of healthy groceries, the changes in healthy grocery availability—and thus the possible effects on healthful purchases—will be limited. Indeed, total expenditure shares across grocery stores, supercenters, and club stores increase by only a fraction of a percentage point in the full sample, with no statistically detectable effect in the food desert subsample. Thus, the primary effect of supermarket entry is to divert sales from other supermarkets. The bottom panels of Figure IV present results with the Health Index of purchased groceries as the dependent variable. Both panels show no detectable increase in healthy purchases after supermarket entry. In any given quarter, we can reject Health Index increases of more than about 0.02 (0.05) standard deviations in the full (food desert) sample. Table II presents estimates of equation (1) using the same dependent variables as in Figure IV. 
Panel A considers the effects on expenditure shares, first at entrant retailers and then across all grocery stores, supercenters, and club stores. Unsurprisingly, all effects are significantly larger for stores entering within 10 minutes of a household’s census tract centroid than for stores entering 10–15 minutes away. Columns (1) and (2) consider the full Homescan sample, columns (3) and (4) limit the sample to households in the bottom income quartile, and columns (5) and (6) limit to households in food deserts.17 Perhaps unsurprisingly, the expenditure share changes are generally larger for low-income households and households in food deserts. However, consistent with Figure IV, most or all of entrant chains’ expenditure share gains consist of diverted sales from other supermarkets.18 Online Appendix Table A4 shows that even among households in food deserts, supermarket entry reduces expenditure shares at drug and convenience stores (which offer fewer healthy UPCs) by only a small fraction of a percentage point.

Table II. Effects of Supermarket Entry

Panel A: Effects on expenditure shares
                            Full sample                        Bottom quartile                    Food deserts
                            Entrants     Grocery/super/club    Entrants     Grocery/super/club    Entrants     Grocery/super/club
                            (1)          (2)                   (3)          (4)                   (5)          (6)
Post entry: 0–10 minutes    1.496***     0.037                 1.966***     −0.034                1.914***     −0.269*
                            (0.098)      (0.051)               (0.243)      (0.145)               (0.303)      (0.159)
Post entry: 10–15 minutes   0.543***     −0.057                0.433***     −0.029                0.762***     −0.038
                            (0.059)      (0.035)               (0.144)      (0.094)               (0.166)      (0.119)
Observations                2,874,514    2,874,365             538,041      537,998               646,223      646,181
Dependent var. mean         9.9          88.2                  7.5          86.2                  6.1          87.7

Panel B: Effects on Health Index
                            Full sample    Bottom quartile    Food deserts
                            (1)            (2)                (3)
Post entry: 0–10 minutes    0.004          0.005              0.007
                            (0.003)        (0.007)            (0.008)
Post entry: 10–15 minutes   0.006***       0.001              0.014***
                            (0.002)        (0.005)            (0.005)
Observations                2,874,514      538,041            646,223

Notes. This table uses 2004–2016 Nielsen Homescan data at the household-by-quarter level. The “food desert” subsample comprises observations with no grocery stores with 50 or more employees, supercenters, or club stores in the ZIP code in the first year the household is observed there. Expenditure shares are the share of total grocery expenditures recorded in Homescan, in units of percentage points. Health Index is our overall measure of the healthfulness of grocery purchases, normalized to mean 0, standard deviation 1 across households. Reported independent variables are the count of supermarkets that have entered within a 0–10- or 10–15-minute drive from the household’s census tract centroid. All regressions control for household demographics (natural log of income, natural log of years of education, age indicators, household size, race indicators, a married indicator, employment status, and weekly work hours), census-division-by-quarter of sample indicators, and household-by-census-tract fixed effects. Observations are not weighted for national representativeness. Robust standard errors, clustered by household and census tract, are in parentheses. *, **, ***: statistically significant with 10%, 5%, and 1% confidence, respectively.

Table II, Panel B presents effects on the Health Index. All of the six point estimates are positive, and in two cases they are statistically significant, but the effect sizes are economically small. Online Appendix Table A4 repeats these estimates using three alternative definitions of food deserts. Again, we find economically small, but in one case statistically significant, effects of supermarket entry on healthy eating.
The insignificant effects are not due to limited power: with 60,000 households and all supermarket entries in the entire United States over a 13-year period, we are able to detect very small effects. We can use these estimates to determine the share of the nutrition–income relationship that can be explained by having more local supermarkets. Recall from Section III.A that households in the top income quartile buy groceries with a Health Index that is 0.56 standard deviations higher than those in the bottom quartile (conditional on household size and age and year indicators). The upper bound of the 95% confidence interval from Table II, Panel B, column (2) implies that one supermarket entry increases the Health Index by no more than about 0.02 standard deviations for bottom income quartile households. Using ZBP data, we calculate that high-income (low-income) Homescan households have an average of 2.47 (2.03) supermarkets in their ZIP code, which implies an average difference of 0.44 supermarkets. Thus, we can conclude that local access to supermarkets explains no more than 0.02 × |$\frac{0.44}{0.56}$| ≈ 1.5% of the Health Index difference between high- and low-income households. In short, differences in local access to supermarkets do not appear to be driving the nutrition–income relationship. Given the academic and policy attention to food deserts and local access to healthy groceries, it is remarkable that supermarket entry seems to matter so little for consumption. However, the limited impact of supermarket entry is less surprising in light of two key facts. First, using data from the 2009 National Household Travel Survey (NHTS)—a nationally representative survey that gathers demographics, vehicle ownership, and “trip diaries” from 150,000 households—we find that the average American travels 5.2 miles to shop, with 90% of shopping trips being made by car (see Online Appendix Figures A6 and A7). 
Although average distances are slightly shorter among low-income households (4.8 miles) and slightly longer among households living in food deserts (nearly 7 miles), the take-away remains the same: households are willing to travel long distances to purchase their groceries. Second, Online Appendix Figure A8 shows that as a result of this travel, households in food deserts spend only about 1% less of their grocery budgets at grocery stores, supercenters, and club stores than households that are not in food deserts. In summary, because most consumers already travel to shop in supermarkets, local supermarket entry does not significantly change choice sets and thus should not be expected to affect healthy eating.

IV.B. “Place Effects” Identified by Movers

Although we have shown that supermarket entry has little effect on healthy eating, a related hypothesis is that a broader class of “place effects” could drive grocery purchases. For instance, peer effects from the eating habits of friends and neighbors as well as general local knowledge and image concerns related to healthy eating could drive a household’s choices. Healthy eating patterns differ substantially across counties, both within and between regions of the country (see Online Appendix Figure A9). The Health Index of grocery purchases tends to be higher in urban areas and lower in the Southeast and parts of the mountain West. The county-level Health Index is highly correlated with county mean income (correlation coefficient |$\rho \approx 0.42$|) and with Chetty et al. (2016)’s county-level life expectancy measure (|$\rho \approx 0.61$|), underscoring both the inequities and the potential implications of what Americans eat and drink. Of course, this geographic variation could reflect any combination of causal place effects and geographic sorting of people with similar preferences.
To test for place effects, we measure within-household changes in grocery purchases after the 20,031 cross–ZIP code moves and the 11,728 cross-county moves that Homescan households made during our sample period. While the ideal experiment to measure place effects would randomly assign households to different neighborhoods, households in our data move for reasons that may create endogeneity concerns. For example, Online Appendix Figure A12 and Online Appendix Table A6 show that moves to healthier counties (although not moves to healthier ZIP codes) are associated with increased household income. In what follows, we study how the estimates are affected by including controls for observed changes in income, job responsibilities, household composition, and marriage status that could generate endogeneity. While we cannot be certain, any remaining endogeneity would likely bias us toward finding place effects, because unobserved lifestyle changes and unreported salary increases that cause people to move to healthier places may also cause healthier eating. We therefore interpret the results in this section as likely upper bounds on true place effects.

Define |$H_{m}$| as the Health Index of packaged groceries purchased in RMS stores in geographic area |$m$|, where |$m$| is either a ZIP code or a county.19 We estimate the following regression in household-by-year Homescan data:

$$\begin{equation} Y_{imt}=\tau H_{m}+\gamma \boldsymbol{X}_{it}+\mu _{t}+\phi _{i}+\varepsilon _{imt}, \end{equation}$$ (3)

where |$\mu_{t}$| denotes year indicators, |$\phi_{i}$| is a household fixed effect, and |$\boldsymbol{X}_{it}$| is again a vector of potentially time-varying household covariates described in Table I. We return to using annual data for this event study because Nielsen only reports the household’s location of residence as of the end of each year. By conditioning on household fixed effects, we isolate changes in grocery purchases associated with changes in neighborhood variables generated by household moves.
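The role of the household fixed effect in equation (3) can be seen in a small Python sketch of the within-household (demeaning) estimator. The sketch is a simplified illustration on made-up, noise-free data: it omits the year indicators and the household covariates, and the function and variable names are hypothetical. Its purpose is to show that a household that never moves has zero within-household variation in the local Health Index and therefore contributes nothing to the slope estimate.

```python
import numpy as np

def within_tau(y, h, household):
    """Slope from regressing household-demeaned y on household-demeaned h.

    Returns the slope together with the demeaned local Health Index so
    that the identifying variation can be inspected.
    """
    y_dm = y.astype(float).copy()
    h_dm = h.astype(float).copy()
    for i in np.unique(household):
        mask = household == i
        y_dm[mask] -= y_dm[mask].mean()
        h_dm[mask] -= h_dm[mask].mean()
    return (h_dm @ y_dm) / (h_dm @ h_dm), h_dm

# Three households observed for four years each (synthetic values).
household = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])
# Local Health Index of the area of residence in each year:
# household 0 never moves, households 1 and 2 move after year two.
h = np.array([0.1, 0.1, 0.1, 0.1,
              -0.2, -0.2, 0.3, 0.3,
              0.4, 0.4, 0.0, 0.0])
phi = np.array([1.0, -0.5, 0.2])  # household fixed effects
true_tau = 0.05                   # made-up place effect
y = true_tau * h + phi[household]

tau_hat, h_dm = within_tau(y, h, household)
stayer_variation = h_dm[household == 0]  # all zeros: no identifying variation
```

In this noise-free example the estimator returns the assumed slope, and only the two movers supply variation in the demeaned regressor.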
Because most Homescan households are in the sample for only a few years, this within-household design only allows us to estimate place effects over the medium term, that is, a few years after a move. Because the set of movers is not nationally representative, we again do not use the Homescan sample weights for this analysis. When estimating equations (3) and (4), we use robust standard errors with two-way clustering by household and geographic area (ZIP code or county).

Before estimating equation (3), we show graphical results of the event study. As before, let |$B_{it}$| be an indicator for whether observation |$it$| is part of a balanced panel around a move, meaning that the household is observed continuously from one year before the move to two years after and at least 50% of the household’s trips to RMS stores are to stores located in the household’s end-of-year county of residence in all three years other than the year of the move.20 These restrictions result in balanced panels that include 2,869 cross–ZIP code moves and 2,277 cross-county moves. Letting |$\Delta_{i}$| denote the change in the average Health Index between a household’s final and original location, we estimate the following regression in household-by-year Homescan data:

$$\begin{equation} Y_{it}=B_{it}\cdot \left(\alpha \Delta _{i}+\sum _{y}(\tau _{y}\Delta _{i}+\omega _{y})\right)+\gamma \boldsymbol{X}_{it}+\mu _{t}+\phi _{i}+\varepsilon _{it}, \end{equation}$$ (4)

where |$y$| indexes years around the move, with the premove year |$y = -1$| as the omitted category. The |$\omega_{y}$| coefficients are intercepts for each year, |$\alpha$| measures the association between the household Health Index and the change in local environment in the year before the move (|$y = -1$|), and |$\tau_{y}$| measures the difference in that association between |$y = -1$| and each other year in the event study window.
As in Section IV.A, the interaction with |$B_{it}$| means that we identify the coefficients of interest using only households in the balanced panel, although we include the full sample in the regression to improve the precision on the demographic associations |$\gamma$|, year effects |$\mu_{t}$|, and household fixed effects |$\phi_{i}$|.

Figure V presents this event study analysis for cross-county moves.21 The top left panel shows the share of shopping trips to RMS stores that are in the new versus the old county for households with |$B_{it} = 1$|. For these households, almost all trips are in the old county before the move, and almost all trips are in the new county after. The top right panel presents the distribution of changes in the local Health Index |$\Delta_{i}$| across household moves. The median cross-county mover experiences a local Health Index change of 0.13 standard deviations; this variation in |$\Delta_{i}$| is what identifies |$\tau_{y}$|.

Figure V. Event Study of Moves across Counties. Using 2004–2016 Homescan data, these figures present results for the event study of moves across counties. The top left panel presents the share of shopping trips that are in the new versus old county. The top right panel presents the distribution across balanced panel households of the difference in the Health Index between the new and old county. The bottom panels present the |$\tau_{y}$| parameters and 95% confidence intervals from estimates of equation (4): associations between the household-level Health Index and the difference in the average local Health Index between postmove and premove locations. The bottom right panel includes controls for household demographics (natural log of income, natural log of years of education, age indicators, household size, race indicators, a married indicator, employment status, and weekly work hours). The Health Index is our overall measure of the healthfulness of grocery purchases and is normalized to have a mean of 0 and a standard deviation of 1 across households. Observations are not weighted for national representativeness.

The bottom panels of Figure V show the estimated |$\tau_{y}$| coefficients and 95% confidence intervals. The bottom left panel shows results excluding household demographics |$\boldsymbol{X}_{it}$|, and the bottom right panel includes |$\boldsymbol{X}_{it}$|. In both cases, there is no statistically significant postmove Health Index change associated with |$\Delta_{i}$|, although the point estimates are positive. In other words, the figures show suggestive but insignificant evidence that households purchase more (less) healthy groceries when they move to counties where other households purchase more (less) healthy groceries.

Table III presents estimates of equation (3). Columns (1) and (2) consider cross–ZIP code moves, and columns (3) and (4) consider cross-county moves.
Sample sizes are slightly smaller in columns (1) and (2) because |$H_{m}$| is missing for ZIP codes with no RMS stores. In all four columns, |$\hat{\tau }$| is positive and statistically significant.22 Columns (2) and (4) include controls for household demographics |$\boldsymbol{X}_{it}$|; including these demographics has very little impact on the results. However, adding demographic controls only slightly increases the regression |$R^{2}$|, suggesting that unobserved within-household changes could be relevant (Oster 2019).

Table III. Association of Health Index with Local Area Health Index Using Movers

                                       (1)         (2)         (3)         (4)
ZIP code average Health Index          0.0511**    0.0487**
                                       (0.0247)    (0.0245)
County average Health Index                                    0.1067*     0.1100**
                                                               (0.0565)    (0.0560)
Household demographics                 No          Yes         No          Yes
Observations                           564,944     564,944     570,279     570,279
95% confidence interval upper bound    0.100       0.097       0.217       0.220

Notes. This table uses 2004–2016 Nielsen Homescan data at the household-by-year level. The sample excludes observations where less than 50% of trips to RMS stores are in the household’s end-of-year county of residence. The Health Index is our overall measure of the healthfulness of grocery purchases and is normalized to have a mean of 0 and a standard deviation of 1 across households. Household demographics are natural log of income, natural log of years of education, age indicators, household size, race indicators, a married indicator, employment status, and weekly work hours. All regressions also control for year indicators and household fixed effects. Observations are not weighted for national representativeness. Robust standard errors, clustered by household and local area (ZIP code or county), are in parentheses. *, **, ***: statistically significant with 10%, 5%, and 1% confidence, respectively.

Using these results, we can bound the extent to which location explains the nutrition–income relationship.
We consider a partial equilibrium thought experiment in which an individual household moves from a low- to high-income retail environment, leaving aside general equilibrium effects that would occur if this happened on a large scale. The average household in the top quartile of income (residual of household size and age and year indicators) lives in a ZIP code (county) with a Health Index that is 0.11 (0.08) higher than households in the lowest income quartile. The upper bound of the confidence interval on τ for ZIP codes (counties) from Table III is about 0.10 (0.22), and the difference between the high- and low-income Health Index is 0.56 standard deviations. Thus, in combination with the assumption that any endogeneity would bias τ upward relative to the causal effect of place, we conclude that medium-term place effects can explain no more than 0.11 × |$\frac{0.10}{0.56}$| ≈ 2.0% of the high- versus low-income difference in the Health Index using cross–ZIP code moves, and no more than 0.08 × |$\frac{0.22}{0.56}$| ≈ 3.2% using cross-county moves. Although we have the power to detect statistically significant associations, it is clear that place effects do not explain a large share of nutritional inequality. V. A Model of Grocery Demand In this section, we estimate a structural model of grocery demand and use the estimates to decompose the nutrition–income relationship into supply-side and demand-side factors. We then use the model to evaluate counterfactual policies that could reduce nutritional inequality. We build our structural approach on the framework introduced by Dubois, Griffith, and Nevo (2014), although our estimation strategy will differ. Similar to the aisles of a typical store, we assume that there are J product groups indexed j = 1, …, J, such as milk, carbonated soft drinks, and bread. Each product group has |$\mathcal {K}_{j}$| food products (UPCs) indexed |$k=1,\ldots,\mathcal {K}_{j}$|⁠. 
We denote a household’s consumption of product k in group j (measured in calories) as ykj, and we denote the corresponding price paid per calorie as pkj. We let x denote the composite good capturing all other expenditures, with price normalized to px = 1. Each product k in group j is characterized by C characteristics {akj1, …, akjC}, which could include flavor, health implications, shelf life, and so on. We denote the household’s total consumption of each characteristic c = 1, …, C as zc = ∑j∑kakjcykj. Following Dubois, Griffith, and Nevo (2014), we assume household’s preferences for food are defined by a utility function with constant elasticity of substitution preferences over calories from each of |$\mathcal {K}_{j}$| products within each product group j, Cobb-Douglas preferences over J product groups, and linear preferences over characteristics: $$\begin{equation} U\left(x,\boldsymbol{z},\boldsymbol{y};\Theta ,\Psi \right)=\sum _{j=1}^{J}\mu _{j}\ln \left(\sum _{k=1}^{\mathcal {K}_{j}}\Psi _{kj}y_{kj}^{\theta _{j}}\right)+\sum _{c=1}^{C}\beta _{c}z_{c}+\lambda x. \end{equation}$$ (5) The parameter μj captures a household’s satiation rate over calories consumed in group j; θj determines the household’s satiation rate over calories consumed through product k in group j; Ψkj allows for perceived product differentiation, so that the household’s marginal benefit of calories can differ across products within a group; λ represents the marginal utility of consuming the outside good; and βc represents the marginal utility of consuming characteristic c. Let Kj denote the number of products that the household actually purchases in group j. 
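As a rough numerical illustration of equation (5), the sketch below evaluates the utility function for a small bundle. The function name `utility`, the nested-list data layout, and all parameter values are assumptions made for this example; they are not taken from the article or its replication code.

```python
import math


def utility(x, y, a, psi, theta, mu, beta, lam):
    """Evaluate the utility function in equation (5).

    Hypothetical data layout (an assumption for this sketch):
      x            outside-good consumption
      y[j][k]      calories of product k in group j
      a[j][k][c]   amount of characteristic c per calorie of product k in group j
      psi[j][k]    perceived-differentiation weight Psi_kj
      theta[j]     within-group curvature theta_j
      mu[j]        group weight mu_j
      beta[c]      marginal utility of characteristic c
      lam          marginal utility of the outside good
    """
    n_chars = len(beta)

    # Total consumption of each characteristic: z_c = sum_j sum_k a_kjc * y_kj
    z = [0.0] * n_chars
    for j, group in enumerate(y):
        for k, calories in enumerate(group):
            for c in range(n_chars):
                z[c] += a[j][k][c] * calories

    # Cobb-Douglas (log) aggregation over groups of a CES aggregate within each group
    group_term = sum(
        mu[j] * math.log(sum(psi[j][k] * cal ** theta[j] for k, cal in enumerate(group)))
        for j, group in enumerate(y)
    )

    # Linear preferences over characteristics plus the outside good
    characteristic_term = sum(b * zc for b, zc in zip(beta, z))
    return group_term + characteristic_term + lam * x


# Illustrative call with made-up numbers: one group, one product, one characteristic.
u = utility(x=3.0, y=[[1.0]], a=[[[2.0]]], psi=[[1.0]],
            theta=[0.5], mu=[1.0], beta=[0.5], lam=1.0)
```

With these made-up inputs the log term is zero, the characteristic term is 0.5 × 2 = 1, and the outside good contributes 3, so the three additive pieces of equation (5) can be checked by hand.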
The first-order conditions from maximizing equation (5) subject to a budget constraint can be summed over all products purchased in group j and rewritten as follows: $$\begin{equation} \sum _{k=1}^{K_{j}}p_{kj}y_{kj}^{*}=\sum _{c=1}^{C}\frac{\beta _{c}}{\lambda }\sum _{k=1}^{K_{j}}a_{kjc}y_{kj}^{*}+\frac{\mu _{j}\theta _{j}}{\lambda },\text{ }j=1,\ldots,J. \end{equation}$$ (6) The term |$\frac{\mu _{j}\theta _{j}}{\lambda }$| represents what the household would spend on product group j if products in that group had zero characteristics, and the term |$\sum _{c=1}^{C}\frac{\beta _{c}}{\lambda }\sum _{k=1}^{K_{j}}a_{kjc}y_{kj}^{*}$| captures the household’s additional expenditures in group j due to the products’ characteristics. A household will spend more in group j (the left side will be larger) if it satiates more slowly in that group (μj and θj are larger) or if it gets more characteristics that it values from that group (akjc is larger for characteristics with more positive βc). Higher marginal utility of outside good consumption λ reduces grocery expenditures. VI. Estimation and Results VI.A. Empirical Model To apply the model to data, let i = 1, …, I index households, and let t = 1, …, T index years. 
As shown in Online Appendix D.A, we can aggregate the first-order conditions from equation (6) over time to obtain the following annual calorie demand: $$\begin{equation} \ln Y_{ijt}=-\ln \left(\tilde{p}_{ijt}-\sum _{c=2}^{C}\tilde{\beta }_{c}\tilde{a}_{ijct}-\xi \right)+\delta _{j}+\phi _{m}+\phi _{t}+\varepsilon _{ijt}, \end{equation}$$ (7) where Yijt is the total calories consumed by household i in product group j during year t; |$\tilde{p}_{ijt}$| and |$\tilde{a}_{ijct}$|⁠, respectively, are the average price paid per calorie and average amount of characteristic c per calorie for household i’s purchases in group j in year t; |$\tilde{\beta }_{c}=\frac{\beta _{c}}{\lambda }$| is the money-metric marginal utility of characteristic c; and ξ is a product characteristic that is unobserved to the econometrician (indexed c = 1). The model also includes product group, market, and year fixed effects (δj, φm, and φt, respectively), and a random demand disturbance, ϵijt. We allow different parameters for each income quartile. Equation (7) uses intuitive variation to estimate the preference parameters. The independent variable inside the parentheses is an “implicit price”: the actual price adjusted for the utility value of the characteristics in product group j. As this implicit price increases, quantity purchased decreases. Despite using a Cobb-Douglas functional form that typically restricts product group price elasticities to 1, variation in the level of the unobserved characteristic will lead to variation in price elasticities. Specifically, an income group’s price elasticity is determined by the absolute magnitudes of |$\tilde{\beta }_{c}$| and ξ: larger (smaller) |$\tilde{\beta }_{c}$| and ξ parameters scale down (up) the importance of price variation in determining quantity purchased Yijt. By allowing these preference parameters to vary by income group, we allow different income groups to differentially value both characteristics and price. VI.B. 
Price Endogeneity A worrisome source of endogeneity arises from a potential correlation between a household’s idiosyncratic product group preferences ϵijt and the household’s average price paid |$\tilde{p}_{ijt}$|⁠: $$\begin{equation} \mathbb {E}\left(\varepsilon _{ijt}\tilde{p}_{ijt}\right)\ne 0. \end{equation}$$ (8) Such endogeneity could arise if households shop at stores offering lower prices for product groups on which they spend more or if households seek out systematically different quality levels (and thus price levels) in groups on which they spend more. Endogeneity could also arise through simultaneity bias, in which retailers set higher markups in response to higher demand. To address the possibility of price endogeneity, we develop a new instrument for prices. The underlying intuition for the instrument is that retail chains differ in their sourcing and distribution costs across products, giving different chains heterogeneous comparative advantages in supplying different products. Then, because different chains are present in different geographic areas, the relative prices of different products also vary across areas. To illustrate, consider a simple example in which there are two types of foods, apples and pizza, and two grocery chains, Safeway and Shaw’s. Suppose that Safeway is able to supply pizza cheaply, and Shaw’s can supply apples cheaply. In this case, areas dominated by Safeway will have relatively low pizza prices, and areas dominated by Shaw’s will have relatively low apple prices. Conversations with insiders from the grocery industry suggest that several factors contribute to differences in comparative advantage across retail chains within a product group. First, different products are produced in different parts of the country, generating transportation costs that vary across retail chains in different regions. Second, some retailers are larger than others, and economies of scale vary across product groups. 
Third, although the Robinson-Patman Act prohibits wholesale price discrimination, the act is increasingly unenforced (Lipman 2012), and producers often offer more subtle contractual incentives that generate variation in effective marginal cost across retailers. Fourth, grocers are increasingly offering private-label brands and sometimes even vertically integrating into production of some products, generating differential pricing advantages across categories. For example, Walmart makes some of its own milk, Lidl makes its own ice cream, and large chains benefit from economies of scale in certain categories when purchasing from private-label manufacturers (Frank 2012; Watson 2017; Boss 2018; Parker et al. 2018). We construct our instrument as follows. For retail chain r in market (i.e., county) m during time period t, let ln (pkrt, −m) denote the average log price of UPC k in stores from the same chain but in all markets excluding market m. Similarly, let ln (pkt, −m) denote the national average log price of UPC k in period t in all markets excluding m. We exclude market m to ensure that the IV reflects a chain’s comparative advantages in supplying product k based on other markets, not local demand conditions in market m. Retail chain r’s cost advantage in supplying UPC k relative to the national average is thus Δln (pkrt, −m) = ln (pkrt, −m) − ln (pkt, −m). Let Nrmt denote retailer r’s number of establishments in market m in year t, let Njrt denote the average sales per store of a UPC in product group j at retailer r in year t, and let Nkt denote the total calories of product k sold nationwide in year t. The price instrument Pjmt is the weighted average cost advantage that chains in market m have for UPCs in product group j: $$\begin{equation} P_{jmt}=\frac{\sum _{r\in m}N_{rmt}N_{jrt}\cdot \sum _{k=1}^{\mathcal {K}_{j}}N_{kt}\Delta \ln (p_{krt,-m})}{\sum _{r\in m}N_{rmt}N_{jrt}\cdot \sum _{k=1}^{\mathcal {K}_{j}}N_{kt}}. 
\end{equation}$$ (9) The variation in our instrument comes from the interaction between retail chains’ differing pricing advantages, Δln (pkrt, −m), and their differing presence across geographic markets, NrmtNjrt. Because equation (7) includes product group, market, and year fixed effects, identification comes only from variation in the relative prices across product groups within a market in a given year.23 Online Appendix D.C presents a series of additional tables and figures illustrating the variation in this instrument. Online Appendix Table A8 shows that most of the variation within product groups is explained by retailer-specific pricing variation that is constant across counties, not county-specific pricing variation that is constant across retailers. This implies that little of our identification could possibly come from endogenous pricing responses to variation in consumer preferences across markets. Online Appendix Figure A11 maps the geographic presence of the five largest retailers in RMS, illustrating substantial variation within and between regions of the country. Online Appendix Figure A14 shows that those five retailers set different relative prices for four example product groups. Online Appendix Figure A15 illustrates the resulting geographic variation in the instrument for those four product groups. Produce is predicted to be cheap on the West Coast and expensive on the East Coast. This is likely because so much produce is grown in California and there are material transport costs, so grocery chains on the West Coast can source produce more cheaply. Yogurt is predicted to be cheap for a dispersed set of counties in the Midwest, and cookies are predicted to be relatively cheap in Massachusetts and western Texas. The figures make clear that there is substantial within-region variation, that this variation is not closely related to county income, and that there are substantial differences in the spatial patterns across product groups. 
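The construction of the instrument in equation (9) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the dictionary layout, and the toy numbers are hypothetical, and the sketch assumes every chain in the market has a leave-out cost advantage available for every UPC in the product group.

```python
from statistics import mean


def leave_out_cost_advantage(chain_log_prices, all_log_prices):
    """Delta ln(p_krt,-m): a chain's average log price for a UPC minus the
    national average log price, both computed over markets other than m.
    Inputs are lists of log prices observed outside market m (hypothetical layout)."""
    return mean(chain_log_prices) - mean(all_log_prices)


def price_instrument(chains, calories):
    """P_jmt from equation (9) for one product group j in one market m and year t.

    chains:   list of dicts, one per chain present in market m, with keys
              'stores'           -> N_rmt, number of establishments in the market
              'sales_per_store'  -> N_jrt, average sales per store in group j
              'delta_log_price'  -> dict mapping UPC to Delta ln(p_krt,-m)
    calories: dict mapping UPC to N_kt, nationwide calories sold
    """
    numerator = 0.0
    denominator = 0.0
    total_calories = sum(calories.values())
    for chain in chains:
        weight = chain["stores"] * chain["sales_per_store"]
        numerator += weight * sum(
            calories[upc] * chain["delta_log_price"][upc] for upc in calories
        )
        denominator += weight * total_calories
    return numerator / denominator


# Toy market: chain A has three stores and is 10 log points cheaper than the
# national average on both UPCs; chain B has one store and is 10 log points dearer.
toy_chains = [
    {"stores": 3, "sales_per_store": 1.0, "delta_log_price": {"u1": -0.10, "u2": -0.10}},
    {"stores": 1, "sales_per_store": 1.0, "delta_log_price": {"u1": 0.10, "u2": 0.10}},
]
toy_calories = {"u1": 100.0, "u2": 300.0}
p_jmt = price_instrument(toy_chains, toy_calories)
```

In the toy market the cheaper chain has three times the store presence, so the instrument is negative (about −0.05): the product group is predicted to be relatively cheap there for reasons unrelated to local demand.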
Online Appendix Figure A16 plots the standard deviation in our instrument for the six product “departments” (broadly aggregated product categories as defined by Nielsen), after residualizing against year, market, and product group fixed effects. Fresh produce has the most variation, followed in order by dairy, frozen foods, packaged meats and deli items, and dry grocery. This ordering is consistent with how costs vary across chains. Because produce is grown more in certain areas of the country, transportation costs differ considerably across chains in different regions. Furthermore, fresh produce and dairy require refrigeration and are highly perishable, making their cost quite sensitive to a chain’s distribution network. By contrast, dry grocery items require no refrigeration and have long shelf lives, and are thus equally easy for different retailers to transport. The instrument is very powerful. Online Appendix Figure A17 shows that there is a robust linear relationship between log prices and the instrument, controlling for market and product group fixed effects. A linear version of our IV procedure has first-stage F-statistics ranging from 243 to 260 in the four income groups. Our identifying assumption is that household i’s idiosyncratic preferences for product group j are uncorrelated with the price instrument Pjmt for group j in household i’s market: $$\begin{equation} \mathbb {E}\left(\varepsilon _{ijt}P_{jmt}\right)=0. \end{equation}$$ (10) The key economic content of our identifying assumption is that chains do not have comparative pricing advantages in product groups where their customers have unobservably stronger tastes. Of course, we cannot test whether the instrument is correlated with unobserved tastes. We can, however, present suggestive tests of whether the instrument is correlated with tastes predicted by observable characteristics. 
The economic forces that would violate the exclusion restriction would probably generate correlations between the instrument and both econometrically predictable and unpredictable tastes. If the instrument is valid, we should expect to see no relationship between the instrument and predictable tastes. For example, imagine that high-income households demand more produce, and low-income households demand more pasta. If our instrument is valid, we should expect no relationship between Pjmt and the consumption of produce versus pasta as predicted by county income. We implement two tests, each of which predicts tastes in a different way. First, we predict purchases of product group j using household demographics and then predict county-level purchases on the basis of county average demographics. We find that the instrument is not associated with predicted purchases conditional on our standard set of county, product group, and year fixed effects. Second, on the basis of the nutrition–income relationship documented in Section III.A, we recognize that higher county income predicts more purchases of healthy foods. We find that the instrument is not systematically different in high–Health Index product groups in low- versus high-income counties. See Online Appendix Table A9 for details. This instrument is novel in the literature, and it can be used in situations in which other instruments do not generate identification or fail the exclusion restriction. DellaVigna and Gentzkow (2019), for example, use price variation from individual stores’ short-term promotions. Although this variation can identify a store’s residual demand elasticity, it does not identify consumers’ demand elasticity if consumers substitute across stores. Hausman (1996) uses variation in prices over time in other markets, which is valid only under the assumption that demand shocks are uncorrelated across markets. 
By contrast, our instrument generates cross-sectional identification, while relying on an exclusion restriction that could be relatively plausible in many applications. VI.C. Method of Moments Estimation For estimation, we construct separate data sets for four household income quartiles, where income is residual of household size and age and year indicators to be consistent with the earlier parts of the article. Data are at the household-by–product group–by-year level. We define J = 45 product groups using a slight modification of Nielsen’s original “product group” variable, combining a handful of groups with infrequent purchases so as to minimize observations with zero purchases. We drop any remaining observations with zero purchases, because the first-order condition does not hold for these observations.24 We define C − 1 = 12 observed product characteristics for the 12 dietary components that enter the Health Index: grams of sodium per 1,000 calories, ounces of whole grains per 1,000 calories, and so on. For each income group, we estimate four parameter vectors: the (C − 1) × 1 vector |$\tilde{\boldsymbol{\beta }}$| of preferences for observed characteristics, the scalar ξ representing the unobserved characteristic, the J × 1 vector |$\boldsymbol{\delta }$| of product group fixed effects, and the M × 1 vector |$\boldsymbol{\phi }$| of market fixed effects. To specify the moment conditions, let |$\boldsymbol{D}_{j}$| be a J × 1 vector of dummy variables for whether the observation is in product group j, and let |$\boldsymbol{D}_{m}$| be an M × 1 vector of dummy variables for whether the household is in market m. 
The model estimation relies on the following set of (C + J + M) identifying moments that just identify our model parameters: $$\begin{eqnarray} \mathbb {E}\left((\delta _{j}+\varepsilon _{ijt})\tilde{a}_{ijct}\right) & =&0 ,\text{ }c=2,\ldots,C\\ \mathbb {E}\left(\varepsilon _{ijt}P_{jmt}\right) & =&0\nonumber\\ \mathbb {E}\left(\varepsilon _{ijt}D_{ijt}\right) & =&0 ,\text{ }j=1,\ldots,J\nonumber\\ \mathbb {E}\left(\varepsilon _{ijt}D_{im}\right) & =&0 ,\text{ }m=1,\ldots,M .\nonumber \end{eqnarray}$$ (11) Loosely, the first set of moments identifies the |$\tilde{\boldsymbol{\beta }}$| parameters, the second identifies ξ, the third set identifies |$\boldsymbol{\delta }$|, and the fourth set identifies |$\boldsymbol{\phi }$|. The first set of moments identifies characteristic preferences using two types of variation. First, we use the variation between product groups: we infer that |$\tilde{\beta }_{c}$| for sodium is high if consumers spend more on product groups with high sodium. Second, we use variation across households within product groups: we also infer that |$\tilde{\beta }_{c}$| for sodium is high if consumers who purchase especially salty products within a group purchase more calories in that group. See Online Appendix D.B for details about the method of moments estimator for our model parameters, |$\left(\hat{\boldsymbol{\delta }},\hat{\boldsymbol{\phi }},\hat{\tilde{\boldsymbol{\beta }}},\hat{\xi }\right)^{\prime }$|, and their standard errors.

VI.D. Estimation Results

Table IV reports the estimated characteristic preference parameters |$\hat{\tilde{\beta }}_{c}$| for the four income quartiles. The |$\hat{\tilde{\beta }}_{c}$| parameters represent willingness to pay (WTP) in dollars per unit of characteristic c, where the unit is as originally specified by the HEI. For example, produce is measured in cups, whereas protein is measured in ounces.
The normalization of |$\hat{\tilde{\beta }}_{c}$| into dollar units removes differences in λ, the marginal utility of money, across income groups.

Table IV
Preferences for Nutrients by Household Income

| Income quartile | Sodium (1) | Whole fruit (2) | Other fruit (3) | Whole grains (4) | Refined grains (5) | Greens, beans (6) | Other veg (7) |
|---|---|---|---|---|---|---|---|
| Income Q1 | −0.242*** | −0.252*** | 0.177*** | 0.252*** | 0.022*** | 0.965*** | −0.438*** |
| | (0.009) | (0.017) | (0.004) | (0.004) | (0.001) | (0.009) | (0.014) |
| Income Q2 | −0.351*** | −0.092*** | 0.215*** | 0.312*** | 0.036*** | 1.02*** | −0.307*** |
| | (0.011) | (0.012) | (0.002) | (0.006) | (0.001) | (0.009) | (0.010) |
| Income Q3 | −0.432*** | −0.057*** | 0.261*** | 0.363*** | 0.045*** | 1.16*** | −0.319*** |
| | (0.012) | (0.011) | (0.001) | (0.006) | (0.001) | (0.011) | (0.010) |
| Income Q4 | −0.634*** | −0.006 | 0.331*** | 0.477*** | 0.073*** | 1.49*** | −0.337*** |
| | (0.022) | (0.014) | (0.001) | (0.012) | (0.003) | (0.021) | (0.014) |

| Income quartile | Dairy (8) | Sea, plant protein (9) | Meat protein (10) | Added sugar (11) | Solid fats (12) | Unobserved characteristic (13) | WTP for Health Index (14) |
|---|---|---|---|---|---|---|---|
| Income Q1 | 0.144*** | −0.275*** | 0.061*** | 0.0002*** | 0.0005*** | −0.00076*** | 0.429*** |
| | (0.001) | (0.008) | (0.001) | (0.0001) | (0.00001) | (0.00003) | (0.011) |
| Income Q2 | 0.163*** | −0.313*** | 0.055*** | −0.0007*** | 0.0009*** | −0.0010*** | 0.631*** |
| | (0.001) | (0.009) | (0.001) | (0.0001) | (0.00002) | (0.00004) | (0.006) |
| Income Q3 | 0.189*** | −0.359*** | 0.057*** | −0.0018*** | 0.001*** | −0.0013*** | 0.820*** |
| | (0.001) | (0.009) | (0.001) | (0.0001) | (0.00002) | (0.00004) | (0.003) |
| Income Q4 | 0.245*** | −0.450*** | 0.059*** | −0.004*** | 0.001*** | −0.021*** | 1.141*** |
| | (0.002) | (0.014) | (0.001) | (0.0002) | (0.00003) | (0.00008) | (0.003) |

Notes. This table presents GMM estimates of the nutrient preference parameters |$\tilde{\beta }_{c}$| from equation (7), separately for the four quartiles of income (residual of household size and age and year indicators). Magnitudes represent willingness to pay for a unit of the nutrient, where the units are those used in the Healthy Eating Index. Sodium is in grams; whole fruit, other fruit, and dairy are in cups; whole grains, refined grains, and both types of protein are in ounces; added sugar is in teaspoons; solid fats are in calories. “WTP for Health Index” in column (14) equals |$\sum _{c}\hat{\tilde{\beta }}_{c}s_{c}r_{c}$|, where sc is the maximum possible score on the Healthy Eating Index for dietary component c, and rc is the difference in consumption of component c to receive the maximum instead of the minimum score. Standard errors, clustered by household, are in parentheses. *, **, ***: statistically significant at the 10%, 5%, and 1% levels, respectively.
Among the eight “healthy” dietary components, six (whole fruit, other fruit, whole grains, greens and beans, vegetables other than greens and beans, and dairy) display a strong and almost uniformly monotonic increase in WTP as household income increases. The only healthy characteristic not fitting this pattern is protein from fish and plants. All households dislike this type of protein, especially high-income households. Meat protein is valued similarly across the income distribution. Among the four “unhealthy” dietary components, two (sodium and added sugar) are more strongly disliked by high-income households. The estimated WTP for added sugar is especially striking: bottom income quartile households are willing to pay $0.0002 to consume a gram of added sugar, whereas top income quartile households are willing to pay $0.004 to avoid consuming a gram of added sugar. Added sugar is the only component for which high- and low-income households have opposite-signed preferences, highlighting substantial differences in preferences for added sugar across the income distribution. High-income households have stronger preferences for the remaining two unhealthy components, refined grains and solid fats. The magnitudes of some preference differences are large: the highest-income quartile dislikes sodium nearly three times as much and likes dairy about twice as much as the lowest-income quartile. We also find that higher-income households have the lowest estimated unobserved characteristic. Accordingly, higher-income households are less sensitive to prices. To derive a summary statistic for overall preferences for healthy eating, we sum the |$\hat{\tilde{\beta }}_{c}$|, weighting by the healthfulness of each dietary component.
To implement this, recall that the HEI grants points for consuming more “healthy” components and fewer “unhealthy” components, with a minimum component score of 0 and a maximum score of 5 or 10 achieved at minimum and maximum thresholds. If sc ∈ {5, 10} is the maximum HEI score for dietary component c and rc is the difference in consumption of component c per 1,000 calories to receive the maximum instead of the minimum score (with rc > 0 for “healthy” components and rc < 0 for “unhealthy” components), column (14) reports |$\sum _{c}\hat{\tilde{\beta }}_{c}s_{c}r_{c}$|⁠. All income groups value healthy groceries, but the highest-income group is willing to pay the most, making healthy eating a normal good. The lowest-income quartile is willing to pay $0.43 per 1,000 calories to consume the maximum-scoring bundle, while the highest-income quartile is willing to pay $1.14 per 1,000 calories. The |$\hat{\tilde{\beta }}_{c}$| parameters are most safely interpreted as preferences for dietary component c and any unmeasured correlates. For example, shelf life and preparation time could be correlated with characteristics such as salt or added sugar content. To consider this possibility, we collected data on each UPC’s shelf life from the U.S. government’s FoodKeeper app (U.S. Department of Health and Human Services 2015) and convenience of preparation from Okrent and Kumcu (2016). Online Appendix Table A10 reports the model estimates including these two additional characteristics. Higher-income households have stronger preferences for convenience and fresher foods with shorter shelf lives, but the patterns of preferences for dietary components across the income distribution are similar to the primary estimates. WTP to consume the maximum-scoring bundle of dietary components is now lower for all income quartiles: $0.20 per 1,000 calories for the lowest-income quartile and $0.63 for the highest-income quartile. 
Even though these dollar values are lower, the ratio of the high-income WTP to the low-income WTP is now larger: |$\frac{\$0.63}{\$0.20}$| ≈ 3.15, compared with |$\frac{\$1.14}{\$0.43}$| ≈ 2.65 in the primary specification.

As discussed in Section VI.B, a key identifying assumption is that the variation in retail chains’ prices across product groups is uncorrelated with unobserved consumer preferences. One key concern might be that retailer chains develop comparative advantages or set prices endogenously in response to local demand in the regions where they operate. To consider this, we reestimate the model including additional fixed effects that absorb possible variation in local demand. Specifically, we add (i) census-region-by–product group fixed effects, (ii) census-region-by–product group fixed effects interacted with an indicator for whether the county is designated as urban or rural, and (iii) census-region-by–product group fixed effects interacted with an indicator for whether the county median income is above the national median. As shown in Online Appendix Table A11, the estimates are all very similar. Thus, any concerns about the exclusion restriction must derive from a model in which adding or removing these types of controls for preference variation does not affect the main conclusions.

VII. Explaining and Reducing Nutritional Inequality

VII.A. Decomposing Consumption Differences into Supply versus Demand Factors

Using the model estimates from Section VI.D, we decompose the observed differences in healthy eating across income groups into underlying supply-side factors (prices and availability) and demand-side factors (preferences for product groups and characteristics). Because our model is estimated at the product group level, our counterfactuals only allow households to reoptimize their calorie demand across product groups. We do not analyze how households would change their relative quantities across UPCs within product groups.
We now index parameters for the four household income groups (residual of household size and age and year indicators) by |$g\in \lbrace 1,2,3,4\rbrace$|. For each income group, we construct a representative product for each product group, and we calculate the resulting representative price |$\tilde{p}_{gj}$|, observed characteristics |$\tilde{a}_{gjc}$|, and Health Index |$H_{gj}$|. This representative product is the weighted average of products available in RMS stores where each income group shops, weighting stores by their share of nationwide trips and weighting UPCs by their share of nationwide calorie consumption.25 For a given set of prices, observed characteristics, and parameter estimates, we can predict |$\hat{Y}_{gj}$|, the total calories that income group g consumes within product group j:
$$\begin{equation} \hat{Y}_{gj}=\frac{\exp (\hat{\delta }_{gj})}{\tilde{p}_{gj}-\sum _{c=2}^{C}\hat{\tilde{\beta }}_{gc}\tilde{a}_{gjc}-\hat{\xi }_{g}}. \end{equation}$$ (13)
This equation excludes |$\hat{\phi }_{gm}$|, as these fixed effects proportionally scale consumption of all product groups, leaving the overall Health Index unaffected. We can then calculate the overall Health Index of the grocery purchases characterized by the predicted |$\hat{Y}_{gj}$|:
$$\begin{equation} \hat{H}_{g}=\sum _{j}\hat{Y}_{gj}H_{gj}. \end{equation}$$ (14)
For this subsection, we renormalize the Health Index so that the initial difference between the highest and lowest income groups equals 1. The leftmost column of points in Figure VI displays each income group’s initial Health Index level, calculated by substituting the predictions of equation (13) into equation (14).

Figure VI. Predicted Health Index for Each Income Group. Each category on the x-axis represents a separate counterfactual calculation. Income groups are quartiles of income, residual of household size and age and year indicators.
The base category measures the Health Index for each income group when each group retains their own preferences and faces their own local supply conditions. The second category sets all prices to those observed for the high-income group. The third category sets all prices and product nutrient characteristics to those in the high-income group. The fourth and fifth categories, respectively, set nutrient preferences and product group preferences equal to those for the high-income group. The Health Index presented on the y-axis is renormalized so that the base difference between the highest- and lowest-income groups equals 1.

We can now simulate the changes in healthy eating that would occur under different counterfactual scenarios for supply- and demand-side factors. We begin with the supply side, motivated by the arguments that food deserts are a key cause of the nutrition–income relationship. Our first counterfactual measures the effect of prices by equalizing the prices that all income groups pay for each product group.
To implement this, for each income group and product group, we recalculate consumption from equation (13) using |$\tilde{p}_{4j}$| instead of |$\tilde{p}_{gj}$|⁠, and then recalculate the overall Health Index using equation (14). The second column of points in Figure VI shows that equalizing prices has essentially no effect on the differences in the Health Index across income groups. Our next counterfactual measures the effect of availability of healthy versus unhealthy groceries, by also equalizing the observed characteristic levels for each product group. To implement this, we recompute each income group’s consumption and resulting Health Index using both |$\tilde{p}_{4j}$| and |$\tilde{a}{}_{4jc}$| in place of |$\tilde{p}_{gj}$| and |$\tilde{a}_{gjc}$|⁠. The third column of points in Figure VI shows that equalizing both prices and the availability of product characteristics decreases the Health Index difference between the highest and lowest income quartiles by 9%. This confirms our findings from Section IV: differences in supply do not explain very much of the nutrition–income relationship. We now explore the role of differences in demand. In addition to equalizing prices and observed characteristics, we equalize preferences for observed and unobserved characteristics.26 To implement this, we recompute consumption and the Health Index after additionally substituting |$\hat{\tilde{\beta }}_{4c}$| and |$\hat{\xi }_{4}$| in place of |$\hat{\tilde{\beta }}_{gc}$| and |$\hat{\xi }_{g}$|⁠. The fourth column of points in Figure VI shows that equalizing these preferences closes almost half of the remaining gap in healthy eating. Finally, we also set the product group preferences |$\hat{\delta }_{gj}$| equal to |$\hat{\delta }_{4j}$|⁠. As the rightmost column of points in the figure shows, this last counterfactual mechanically equalizes the observed purchases across each of the income groups. 
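The sequence of counterfactuals described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' replication code: all numbers are made up, the dictionary keys and function names are hypothetical, and two assumptions are made that the text does not state explicitly. First, the Health Index is divided by total predicted calories, so that proportional scaling of all product groups leaves it unchanged, as the discussion of |$\hat{\phi }_{gm}$| suggests. Second, the product-group Health Index |$H_{gj}$| is swapped together with the observed characteristics.

```python
import math

def predicted_calories(g):
    """Equation (13): predicted calories per product group for one income group."""
    calories = []
    for d, p, a in zip(g["delta"], g["price"], g["attrs"]):
        denom = p - sum(b * x for b, x in zip(g["beta"], a)) - g["xi"]
        if denom <= 0:
            raise ValueError("equation (13) needs a positive denominator")
        calories.append(math.exp(d) / denom)
    return calories

def health_index(g):
    """Calorie-weighted Health Index (normalizing by total calories is an assumption)."""
    y = predicted_calories(g)
    return sum(yj * hj for yj, hj in zip(y, g["H"])) / sum(y)

def sequential_counterfactuals(own, ref):
    """Cumulatively replace own inputs with the reference group's, one block at a time."""
    scenario = dict(own)  # shallow copy; the caller's dict is not modified
    path = [("base", health_index(scenario))]
    swaps = [
        ("prices", ["price"]),
        ("prices + characteristics", ["attrs", "H"]),   # H swap is an assumption
        ("+ characteristic preferences", ["beta", "xi"]),
        ("+ product group preferences", ["delta"]),
    ]
    for label, keys in swaps:
        for key in keys:
            scenario[key] = ref[key]
        path.append((label, health_index(scenario)))
    return path

# Hypothetical inputs: two product groups, one observed characteristic.
low = {"delta": [1.0, 0.5], "price": [2.0, 3.0], "attrs": [[0.1], [0.5]],
       "H": [-1.0, 1.0], "beta": [0.5], "xi": 0.2}
high = {"delta": [0.8, 1.0], "price": [2.1, 2.9], "attrs": [[0.1], [0.6]],
        "H": [-1.0, 1.2], "beta": [1.0], "xi": 0.1}

path = sequential_counterfactuals(low, high)
gap = health_index(high) - health_index(low)
for label, value in path:
    # Share of the base gap closed at each step.
    print(label, round((value - path[0][1]) / gap, 3))
```

In this toy setup the last step reproduces the reference group's index exactly, which mirrors the mechanical equalization in the rightmost column of Figure VI. The shares closed at the intermediate steps depend entirely on the made-up inputs.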
Equalizing demand patterns eliminates 91% of the original Health Index difference between the highest and lowest income quartiles. As a robustness check, Online Appendix Table A11 repeats this decomposition using our alternative estimates that add census-region-by–product group fixed effects and the interactions with urban/rural and above median income indicators. Across these three alternative specifications, we find that supply accounts for 7%–12% of the nutrition–income relationship, while demand accounts for 88%–93%. In summary, about 90% of the nutrition–income relationship is due to demand-side factors related to preferences, and only about 10% is explained by the supply side. Consistent with our event study analyses in Section IV, this finding counters arguments that food deserts are important contributors to nutritional inequality. VII.B. Using Observables to Explain Demand for Healthy Groceries The results of the previous section highlight that demand-side factors, not supply, are central to explaining the nutrition–income relationship. We ask the next natural question: what factors might explain why demand differs by income? The structural model allows us to make a key distinction: instead of analyzing equilibrium purchases of healthy groceries, we isolate each household’s demand for healthy groceries, holding constant supply conditions as measured by prices and observed characteristics. To do this, we compute the sample average prices and observed characteristics for each product group, denoted |$\bar{p}_{j}$| and |$\bar{a}_{jc}$|⁠. We also back out each observation’s fitted error term, |$\hat{\varepsilon }_{ijt}$|⁠. For household i in year t, we compute the total calorie demand in all product groups under sample average supply conditions: $$\begin{equation} \hat{Y}_{ijt}=\frac{\exp (\hat{\delta }_{gj}+\hat{\phi }_{gm}+\hat{\varepsilon }_{ijt})}{\bar{p}_{j}-\sum _{c=2}^{C}\hat{\tilde{\beta }}_{gc}\bar{a}{}_{jc}-\hat{\xi }_{g}}. 
\end{equation}$$ (15) We then insert |$\hat{Y}_{ijt}$| into an analogue of equation (14) to calculate |$\hat{H}_{it}^{D}$|, the Health Index of household i’s grocery demand at sample average supply, and normalize |$\hat{H}_{it}^{D}$| to have a standard deviation of 1. We can now ask what observable characteristics mediate the relationship between income and healthy grocery demand. Consider the following regression: $$\begin{equation} \hat{H}_{it}^{D}=\alpha \ln w_{i}+\gamma ^{1}\boldsymbol{X}_{it}^{1}+\gamma ^{0}\boldsymbol{X}_{it}^{0}+\mu _{t}+\varepsilon _{i}, \end{equation}$$ (16) where |$w_{i}$| denotes household i’s sample average income, |$\mu _{t}$| are year indicators, and |$\boldsymbol{X}_{it}^{0}$| denotes household size and indicators for age. |$\boldsymbol{X}_{it}^{1}$| denotes the remaining demographics from Section IV (natural log of years of education, race indicators, an indicator for whether the household heads are married, employment status, and weekly work hours) plus two additional variables from the Homescan add-on survey carried out by Nielsen for Allcott, Lockwood, and Taubinsky (2019): the self-reported importance of staying healthy and nutrition knowledge, both normalized to have a standard deviation of 1. As discussed in Section III.A, we control for |$\boldsymbol{X}_{it}^{0}$| so as to interpret |$w_{i}$| as a rough measure of the household’s SES, and we think of |$\boldsymbol{X}_{it}^{1}$| as possible mediators of the relationship between SES and healthy grocery demand. We use the approach of Gelbach (2016), which is to think of covariates |$\boldsymbol{X}_{it}^{1}$| as potentially “omitted variables” in the relationship between income and healthy grocery demand and calculate the “omitted variables bias” from excluding each specific covariate.
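The accounting behind this approach, formalized in equation (17), can be illustrated with a minimal sketch in plain Python. It rests on simplifying assumptions and is not the authors' code: the data are made up, the baseline controls |$\boldsymbol{X}_{it}^{0}$| and year indicators are omitted, the helper names `ols_slopes` and `gelbach_shares` are hypothetical, and the association between income and each mediator is measured as the slope from an auxiliary regression of the mediator on log income.

```python
def ols_slopes(y, columns):
    """Slopes from OLS of y on columns; demeaning absorbs the intercept."""
    def center(v):
        m = sum(v) / len(v)
        return [x - m for x in v]

    yc = center(y)
    cols = [center(c) for c in columns]
    k = len(cols)
    # Augmented normal equations [X'X | X'y] on the demeaned data.
    A = [[sum(a * b for a, b in zip(cols[r], cols[c])) for c in range(k)]
         + [sum(a * b for a, b in zip(cols[r], yc))]
         for r in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[r][k] / A[r][r] for r in range(k)]

def gelbach_shares(h, ln_w, mediators):
    """Share of the income coefficient attributed to each mediator."""
    alpha_base = ols_slopes(h, [ln_w])[0]            # regression without mediators
    full = ols_slopes(h, [ln_w] + mediators)         # regression with mediators
    alpha_full, gammas = full[0], full[1:]
    shares = []
    for gamma_v, x_v in zip(gammas, mediators):
        Gamma_v = ols_slopes(x_v, [ln_w])[0]         # auxiliary regression slope
        shares.append(gamma_v * Gamma_v / alpha_base)
    return alpha_base, alpha_full, shares

# Hypothetical data for eight households.
ln_w = [9.5, 10.0, 10.2, 10.6, 10.9, 11.1, 11.4, 11.8]
educ = [2.4, 2.5, 2.6, 2.6, 2.7, 2.8, 2.8, 2.9]      # log years of education
know = [-1.0, 0.3, -0.5, 0.8, 0.1, 1.2, -0.2, 1.0]   # nutrition knowledge score
h = [-1.2, -0.4, -0.6, 0.3, 0.1, 0.9, 0.2, 1.1]      # healthy grocery demand

alpha_base, alpha_full, shares = gelbach_shares(h, ln_w, [educ, know])
print(dict(zip(["education", "nutrition knowledge"], shares)))
```

By construction, the shares sum to the proportional drop in the income coefficient when the mediators are added, which is what makes the decomposition additive across covariates.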
Define |$\hat{\Gamma }_{v}$| as the conditional covariance between income and the individual variable v, estimated in an auxiliary regression, and define |$\hat{\tilde{\alpha }}$| as the estimate of α from equation (16) excluding |$\boldsymbol{X}_{it}^{1}$|. Variable v’s contribution to the relationship between income and healthy grocery demand is $$\begin{equation} \tilde{\pi }_{v}=\frac{\hat{\gamma }_{v}^{1}\hat{\Gamma }_{v}}{\hat{\tilde{\alpha }}}. \end{equation}$$ (17) As with standard omitted variable bias, a covariate will explain more of the relationship if it is more strongly associated with healthy grocery demand (measured by |$\hat{\gamma }_{v}^{1}$|) or with income (measured by |$\hat{\Gamma }_{v}$|). Dividing by |$\hat{\tilde{\alpha }}$| gives variable v’s estimated contribution as a share of the unconditional relationship. See Online Appendix Table A12 for estimates of equation (16) and the auxiliary regressions.

Figure VII presents the estimated |$\tilde{\pi }_{v}$| parameters and 95% confidence intervals. Education explains the largest share of the relationship between demand for healthy groceries and income, at about 20%. Nutrition knowledge explains the second-largest share, at about 14%. These results are correlations, so they do not reflect the causal effect of additional education or nutrition knowledge interventions. That being said, they are suggestive of the roles that improved educational opportunities and nutrition information could play in reducing nutritional inequality.

Figure VII. Explaining the Relationship between Income and Healthy Grocery Demand. This figure presents the |$\tilde{\pi }_{v}$| parameters and 95% confidence intervals from equation (17), representing the share of the correlation between income and demand for healthy groceries that is explained by each variable.
The “Census division” bar reflects the joint contribution of the vector of census division indicators.

VII.C. Using Subsidies to Reduce Nutritional Inequality

Our analyses suggest that supply-side policies, such as encouraging supermarket entry, will have limited effects on healthy eating. In this section, we study an alternative policy: subsidies for healthy foods. There are many types of taxes and subsidies that could affect healthy eating. To ease interpretation, we focus on a simple subsidy that scales with a product’s healthfulness and is available only to the bottom quartile of the income distribution, again conditional on household size and age and year indicators. Although the exact setup of these counterfactuals is developed to fit with our previous analyses, we think of this as a stylized implementation of a healthy food subsidy within SNAP, the U.S. government’s means-tested nutritional support program. Specifically, we consider an ad valorem subsidy for each product that is proportional to its Health Index. To target only healthy groceries, we set the subsidy to zero for all products with a below-median Health Index. We continue analyzing composite products representing the calorie-weighted average product sold in product group j to income group g.
Denoting the median product’s Health Index as |$\bar{H}$|, the percent subsidy for product j for households in the bottom income quartile (g = 1) is
$$\begin{equation} s_{1j}=\left\lbrace \begin{array}{@{}l@{\quad }l@{}} \min \lbrace s(H_{1j}-\bar{H}),0.95\rbrace & \text{if } H_{1j}-\bar{H}>0\\ 0 & \text{otherwise}. \end{array}\right. \end{equation}$$ (18)
We limit the subsidy to a maximum of 95% of the product group’s price, which binds in a few cases for the highest subsidy considered below. The bottom income quartile’s subsidized price for product group j is |$\tilde{p}_{1j}^{s}=\tilde{p}_{1j}(1-s_{1j})$|, and their resulting consumption is
$$\begin{equation} \hat{Y}_{1j}=\frac{\exp (\delta _{1j})}{\tilde{p}_{1j}^{s}-\sum _{c=2}^{C}\tilde{\beta }_{1c}\tilde{a}_{1jc}-\xi _{1}}. \end{equation}$$ (19)
We again calculate the Health Index using equation (14), and we numerically solve for the subsidy s that increases the bottom income quartile’s Health Index by a given amount.

Table V presents these results. Column (1) presents the subsidy that reduces the Health Index gap between top- and bottom-quartile households by 0.9%, the point estimate of the impact of supermarket entry from Section IV.A.27 Column (2) presents the subsidy that reduces the Health Index gap by 9%, the effect of equating supply in the primary estimate from Section VII.A. Column (3) presents the subsidy that brings bottom-quartile households’ Health Index to the level of the top quartile. The first two rows present the subsidy parameter s and the mean percent subsidy for product groups with an above-median Health Index, which are the product groups receiving nonzero subsidies. The third row presents the average subsidy payment per bottom-quartile household. The bottom row presents the total subsidy payment, aggregating over all 31.45 million households in the bottom income quartile.
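The subsidy schedule in equation (18) and the numerical solve for s can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the inputs are made up, the function names are hypothetical, the Health Index is normalized by total calories, bisection is used as one possible root-finder (the text does not say which solver is used), and the index is assumed to rise monotonically in s over the search bracket.

```python
import math

def subsidy_rate(s, h, h_median, cap=0.95):
    """Equation (18): percent subsidy, zero at or below the median Health Index."""
    gap = h - h_median
    return min(s * gap, cap) if gap > 0 else 0.0

def health_index_under_subsidy(s, g, h_median):
    """Equation (19) for calories, then a calorie-weighted Health Index."""
    calories = []
    for d, p, a, h in zip(g["delta"], g["price"], g["attrs"], g["H"]):
        p_sub = p * (1.0 - subsidy_rate(s, h, h_median))  # subsidized price
        denom = p_sub - sum(b * x for b, x in zip(g["beta"], a)) - g["xi"]
        if denom <= 0:
            raise ValueError("equation (19) needs a positive denominator")
        calories.append(math.exp(d) / denom)
    return sum(y * h for y, h in zip(calories, g["H"])) / sum(calories)

def solve_subsidy(target, g, h_median, lo=0.0, hi=0.5, tol=1e-10):
    """Bisection for the s that reaches the target index (assumes monotonicity)."""
    def excess(s):
        return health_index_under_subsidy(s, g, h_median) - target
    if excess(lo) > 0 or excess(hi) < 0:
        raise ValueError("target is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical bottom-quartile inputs: three product groups, one characteristic.
low = {"delta": [1.0, 0.7, 0.5], "price": [2.0, 2.5, 3.0],
       "attrs": [[0.1], [0.3], [0.5]], "H": [-1.0, 0.0, 1.0],
       "beta": [0.5], "xi": 0.2}
h_median = 0.0
hypothetical_gap = 0.5  # made-up top-minus-bottom Health Index gap

base = health_index_under_subsidy(0.0, low, h_median)
target = base + 0.09 * hypothetical_gap  # close 9% of the made-up gap
s_star = solve_subsidy(target, low, h_median)
print("subsidy parameter s:", s_star)
```

The upper end of the bracket is kept small enough that every subsidized denominator stays positive in this toy example; with real inputs that check would need to be made for each product group.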
Table V. Impacts of Means-Tested Healthy Grocery Subsidies

| | (1) Subsidy to close gap by 0.9% (estimated supermarket entry effect) | (2) Subsidy to close gap by 9% (structural estimate of equal supply conditions) | (3) Subsidy to close gap by 100% |
| --- | --- | --- | --- |
| Subsidy parameter s | 0.000067 | 0.000657 | 0.00601 |
| Mean subsidy for subsidized products | 0.60% | 5.8% | 48.7% |
| Average subsidy payment per household | $2.62 | $26.35 | $336.10 |
| Total subsidy, all bottom-quartile households | $84 million | $830 million | $10.57 billion |

Notes. This table presents the amounts of healthy grocery subsidies required to reduce the difference in the Health Index between top and bottom income quartile households by given percentages. The subsidies are available to bottom income quartile households in proportion to a product’s Health Index, conditional on the Health Index being above the median product’s Health Index. “Mean subsidy” is the percent discount among products receiving strictly positive subsidies. “Total subsidy, all bottom-quartile households” is the average subsidy payment per household multiplied by 31.45 million, the number of households in the bottom income quartile.
The results in column (1) show that an annual subsidy of $84 million would increase healthy eating by the same amount as one additional supermarket entry within a 10-minute drive of all bottom-quartile households.
In comparison, the Healthy Food Financing Initiative has spent about $220 million of its $400 million budget on store subsidies (Reinvestment Fund 2017), and various state programs have spent tens of millions (CDC 2011). Of course, the impacts of these supply-side programs on store entry are unclear, healthy eating is not the only social objective, and government expenditures are not a complete measure of social costs. What we can conclude is that a supermarket entry subsidy of more than $2.62 per year per household within a 10-minute drive would be less cost-effective than this means-tested subsidy at increasing the Health Index of bottom-quartile households’ grocery purchases. Column (2) shows that an annual subsidy of $830 million would increase healthy eating by the same amount as providing bottom-quartile households the same supply conditions as top-quartile households. From this, we can similarly conclude that even a suite of supply-side policies that are somehow able to achieve this full equalization of supply will only be cost-effective if they cost less than $830 million a year. Column (3) shows that a subsidy of $11 billion a year could raise bottom-quartile households’ Health Index all the way to the level of top-quartile households. In comparison, the annual SNAP budget in 2016 was $71 billion (Center for Budget and Policy Priorities 2018). Thus, our model suggests that adding a healthy food subsidy to SNAP could eliminate this measure of nutritional inequality at an additional cost of only 15% of the SNAP budget.28 Our model requires many assumptions, so we view these results as suggestive and a potential motivation for demonstration projects, ideally using randomized experiments such as in Bartlett et al. (2014), to assess the effects of such policies. Furthermore, there are many economic and political considerations around modifying SNAP to encourage healthy eating (Shenkin and Jacobson 2010; Richards and Sindelar 2013; Schanzenbach 2017). 
However, this result on the cost-effectiveness of healthy-eating subsidies, combined with our earlier results on the ineffectiveness of supply-side policies, suggests that policy makers interested in reducing nutritional inequality might redirect efforts away from promoting access to healthy groceries and toward means-tested subsidies. VIII. Conclusion We study the causes of nutritional inequality: why the wealthy tend to eat more healthfully than the poor in the United States. The public health literature has documented that lower-income neighborhoods suffer from lower availability of healthy groceries and that lower-income households tend to eat less healthfully. In public policy circles and government, this relationship has been taken as causal, with significant policy attention devoted to improving access to healthy groceries in low-income neighborhoods. We test this hypothesis using several complementary empirical strategies. Entry of a new supermarket has economically small effects on healthy grocery purchases, and we can conclude that differential local supermarket density explains no more than about 1.5% of the difference in healthy eating between high- and low-income households. The data clearly show why this is the case: Americans travel a long way for shopping, so even people who live in food deserts with no supermarkets get most of their groceries from supermarkets. Entry of a new supermarket nearby therefore mostly diverts purchases from other supermarkets. This analysis reframes the discussion of food deserts in two ways. First, the notion of a food desert is misleading if it is based on a market definition that understates consumers’ willingness to travel. Second, any benefits of eliminating food deserts derive less from healthy eating and more from the consumer surplus gains due to increased local variety and decreased travel costs. 
In a second event study analysis, we find that moving to an area where other people eat more or less healthfully does not affect households’ own healthy eating patterns, at least over the several-year time horizon that the data allow. In combination with the assumption that any endogeneity would generate upward bias in our estimated place effects, we can conclude that partial equilibrium place effects explain no more than 3% of differences in healthy eating between high- and low-income households. Consistent with the event study analyses, decompositions based on our structural demand model suggest that fully equalizing supply conditions would reduce the difference in healthy eating between low- and high-income households by no more than about 10%. By contrast, our model suggests that a means-tested subsidy for healthy groceries could increase low-income households’ healthy eating to the level of high-income households at an additional cost of only about 15% of the current SNAP budget. Before advocating for or against such a subsidy, one would need to measure the relevant market failures and study optimal policy in a principled welfare maximization framework. However, our results do allow us to conclude that policies aimed at eliminating food deserts likely generate little progress toward a goal of reducing nutritional inequality. Supplementary Material An Online Appendix for this article can be found at The Quarterly Journal of Economics online. Code replicating tables and figures in this article can be found in Allcott et al. (2019), in the Harvard Dataverse, doi: 10.7910/DVN/MSOBYI. 
Footnotes * We thank Prottoy Aman Akbar, Yue Cao, Hae Nim Lee, and Catherine Wright for exceptional research assistance; Charles Courtemanche, Sungho Park, and Andrea Carlson for sharing data; and Marianne Bitler, Anne Case, David Cuberes, Amanda Chuan, Janet Currie, Jan De Loecker, Gilles Duranton, Joe Gyourko, Jakub Kastl, Ephraim Liebtag, Ilyana Kuziemko, Todd Sinai, Diane Whitmore Schanzenbach, Jesse Shapiro, Tom Vogl, and David Weinstein for helpful comments. We thank seminar participants at Amazon, the 2015 and 2017 ASSA Meetings, the Becker Friedman Institute at the University of Chicago, Brown, Columbia, Duke, the Federal Reserve Bank of Kansas City, the Federal Trade Commission, Microsoft Research, the 2015 and 2016 NBER Summer Institutes, New York University, the Paris School of Economics, Penn State, Princeton, the Pritzker School of Medicine, the Robert Wood Johnson Foundation, Stanford, Temple, Toronto, the Tilburg Christmas Research Camp, University of New South Wales, University of North Carolina, University of Pennsylvania, USC Marshall, the USDA, the University of Sydney, the 2014 Urban Economics Association Meeting, Warwick, Wharton, and Yale SOM. We are grateful for funding from the Chicago Booth Initiative on Global Markets, the Wharton Social Impact Initiative, the Research Sponsors’ Program of the Wharton Zell-Lurie Real Estate Center, and the USDA Economic Research Service. This article reflects the researchers’ own analyses calculated based in part on data from the Nielsen Company (US), LLC, and marketing databases provided through the Nielsen Datasets at the Kilts Center for Marketing Data Center at the University of Chicago Booth School of Business. The conclusions drawn from the Nielsen data are those of the researchers and do not reflect the views of Nielsen. Nielsen is not responsible for, had no role in, and was not involved in analyzing and preparing the results reported herein. 
The findings and conclusions in this publication have not been formally disseminated by the USDA and should not be construed to represent any agency determination or policy. This article subsumes and replaces our previous work, Handbury, Rahkovsky, and Schnell (2015) and Allcott, Diamond, and Dubé (2017). 1. See, for example, Alwitt and Donley (1997); Horowitz et al. (2004); Jetter and Cassady (2005); Algert, Agrawal, and Lewis (2006); Baker et al. (2006); Powell et al. (2007); Larson, Story, and Nelson (2009); Sharkey, Horel, and Dean (2010). 2. For example, former First Lady Michelle Obama argues that “it’s not that people don’t know or don’t want to do the right thing; they just have to have access to the foods that they know will make their families healthier” (Curtis 2011). Similarly, in their influential overview article, Bitler and Haider (2011) observe that “it appears that much of the existing research implicitly assumes that supply-side factors cause any food deserts that exist.” Countering this assumption, Bitler and Haider (2011) conclude that “we do not have sufficient evidence to determine whether food deserts are systematically the cause” of unhealthy eating by low-income people. 3. In the United States, such policies include the Healthy Food Financing Initiative (HFFI; Reinvestment Fund), parts of the Agricultural Act of 2014 (Aussenberg 2014), the Affordable Care Act’s Community Transformation Grants, and state programs described in CDC (2011). Notably, since 2011, the HFFI has awarded nearly $200 million to community development organizations and has leveraged more than $1 billion in private investments and tax credits (Food Trust 2019). 4. 
Although not solely focused on retail environments, recent work further examines how changes in a range of socio-environmental factors, including restaurant and food store availability and food prices, have contributed to rising obesity over the past 50 years (Cutler, Glaeser, and Shapiro 2003; Chou, Grossman, and Saffer 2004; Lakdawalla, Philipson, and Bhattacharya 2005; Rashad 2006; Rashad, Grossman, and Chou 2006; Baum and Chou 2016; Courtemanche et al. 2016). 5. The National Health and Nutrition Examination Survey (NHANES) finds that Americans consume 34% of calories away from home, including 25% in restaurants (U.S. Department of Agriculture, Agricultural Research Service 2014). For all income groups, the share of healthy and unhealthy macronutrients (protein, carbohydrates, saturated fat, etc.) consumed away from home is about the same as the share of calories consumed away from home, so grocery purchases are not a systematically biased measure of overall diet healthfulness. 6. To measure true changes in grocery availability experienced by consumers, we must use new physical establishments or stores that significantly improve their grocery selection, such as conversions from standard mass merchants to supercenters selling a full line of groceries. To implement this, we use a list of specific TDLinx stores transferred through mergers and acquisitions to exclude spurious “entrants” that were in fact in continuous operation, and we further drop potentially spurious entries where TDLinx shows a store of the same subtype in the same census block in the previous year. 7. We omit the fatty acid ratio from our linearized HEI both because it is less obvious how to linearize this ratio and because saturated fats are already counted directly as a moderation component. 8. Online Appendix Figure A2 shows that these results are similar when using the magnet subsample, which includes bulk purchases as well as packaged items. 
Online Appendix Figure A3 presents analogues to Figure I considering each individual dietary component in the HEI. Although the HEI imposes specific weights when combining the different dietary components, the figure suggests that higher-income diets would tend to be classified as “more healthy” unless the weights change substantially. 9. This difference is growing over time, increasing from 0.54 in 2004–2007 to 0.61 in 2012–2016. Growing nutritional inequality is consistent with results from the NHANES data (Wang et al. 2014; Rehm et al. 2016) and underscores the importance of this research. 10. This pattern of less-healthy choice sets and store types in low-income neighborhood stores is broadly consistent with a large body of public health literature (e.g., Powell et al. 2007; Larson, Story, and Nelson 2009; Sharkey, Horel, and Dean 2010). To our knowledge, however, we are the first to document this in a data set as comprehensive as RMS. 11. Online Appendix Table A3 demonstrates this in the RMS data. By definition, supercenters carry a full line of groceries, including produce. Club stores such as Sam’s Club, BJ’s, and Costco typically also carry a variety of grocery and produce items. 12. In the working paper versions of this article (Handbury, Rahkovsky, and Schnell 2015; Allcott, Diamond, and Dubé 2017), we showed that while healthy food costs more per calorie than unhealthy food, there is essentially no price difference for categories other than fresh produce. Furthermore, the relative price of healthy versus unhealthy food is actually slightly lower in low-income areas. Therefore, if price plays a role in the nutrition–income relationship, it would have to do so through a preference to reduce produce consumption to economize on calories. 13. According to the 2009 National Household Travel Survey, the median and 75th percentile of shopping travel times are 10 and 15 minutes, respectively. 14. 
Specifically, $\boldsymbol{X}_{it}$ includes the natural log of income, natural log of years of education, indicators for each integer age from 23–90, household size, race indicators, an indicator for whether the household heads are married, employment status, and weekly work hours. 15. Online Appendix Figure A4 presents the analogous figures for the $\tau_{[10,15)q}$ coefficients. The effects are smaller, as would be expected given that the entering stores are 10–15 minutes away instead of 0–10 minutes away. Online Appendix Figure A5 presents analogous figures for balanced panel windows including eight (instead of four) quarters before entry. The results are similar. 16. We cannot look at expenditures at the specific entering establishment, as most Homescan panelists report only the retail chain where they shopped. Nielsen then imputes the specific store based on geographic location, but this imputation can be especially unreliable for stores that entered recently. 17. To match the calculations in Section III.A, we define the “bottom income quartile” as households in the lowest quartile of residuals from a regression of household average income across all years observed in the sample on household size and age and year indicators. 18. Online Appendix Table A5 presents additional estimates considering only entry by supercenters, that is, excluding grocery and club stores. As might be expected from looking at larger stores, the expenditure share effects are larger, but the effects on the Health Index are again economically small and mostly statistically insignificant. There is one marginally significant unexpected result at the top right of Table II, suggesting that entry reduces grocery, supercenter, and club store expenditures for households in food deserts. We think of this as an anomaly, as there are no similar results in our prior working papers nor in the supercenter entry study in Online Appendix Table A5.
Adding county-by-quarter fixed effects or a control for convenience store entry does not qualitatively change the results. 19. The average Homescan panelist lives in a ZIP code with 4.3 RMS stores and a county with 104 RMS stores. Because the RMS data do not contain the complete census of stores, the distribution of store types in the RMS sample may not match a county’s true distribution. For example, the RMS sample might include most of the grocery stores in county A, but few of the grocery stores and most of the drug stores in county B. To estimate the area average Health Index, we thus take the calorie-weighted average Health Index of groceries sold in RMS stores and regression-adjust for the difference between the distribution of store channel types in the RMS data versus the true distribution of store channel types observed in ZBP data. 20. The 50% local shopping restriction aims to eliminate households that move multiple times within a year or otherwise are less strongly exposed to the retail environment in the county where they report living. 21. Online Appendix Figures A10 and A11, respectively, present an analogue of Figure V for cross–ZIP code moves and results for balanced panel windows that include more years before and after moves. Some of the postmove point estimates are again positive, but there is no statistically significant evidence that the average household’s Health Index converges toward the Health Index of the new area after a move, nor is there evidence of potentially problematic premove trends. 22. As a point of comparison, Online Appendix Table A7 uses this same strategy to replicate the immediate brand choice effect in Bronnenberg, Dubé, and Gentzkow (2012), focusing on Coke versus Pepsi. Specifically, we estimate equation (3) using county-level Coke market shares for $H_m$, where Coke market shares are defined as $\frac{\text{Coke calories purchased}}{\text{Coke and Pepsi calories purchased}}$.
We estimate a highly statistically significant $\hat{\tau}\approx 0.16$, which implies that moving to a county with a 10 percentage point higher Coke market share is associated with a 1.6 percentage point increase in the share of a household’s Coke and Pepsi purchases that are of Coke. 23. DellaVigna and Gentzkow (2019) show that retail chains whose stores are in markets with more inelastic demand tend to charge higher prices than other retail chains whose stores are in markets with more elastic demand. Our market fixed effects are designed to address this form of endogeneity in response to overall market demand patterns. 24. We drop 10.6% of observations at the household-by–product group–by-year level because they have zero purchases. “Baby food” is the product group with the most missing observations. The differences across income groups in a product group’s share of missing observations are not correlated with the product group’s average characteristic contents, suggesting that these dropped observations do not affect our estimated preference heterogeneity across income groups. 25. Specifically, let $Q_{kj}$ denote the total nationwide quantity of calories sold of UPC $k$ in product group $j$, let $N_{gs}$ denote the number of trips made by Homescan households of income group $g$ to store $s$, and let $\mathbf{1}(kj\in s)$ denote the indicator function for whether product $kj$ is stocked in store $s$. The Health Index of the representative product is the weighted average of the Health Index for each UPC within the product group:
$$H_{gj}=\frac{\sum_{k}Q_{kj}\sum_{s}N_{gs}\mathbf{1}(kj\in s)H_{kj}}{\sum_{k}Q_{kj}\sum_{s}N_{gs}\mathbf{1}(kj\in s)}. \tag{12}$$
The representative price and observed characteristics are calculated analogously, substituting $\tilde{p}_{gj}$ and $\tilde{a}_{gjc}$ for $H_{kj}$. 26.
The unobserved characteristic $\hat{\xi}$ represents a combination of the amount of the unobserved characteristic and the income group’s preference for it. Although this is a mix of supply and demand forces, we attribute it to demand because it primarily determines the consumer’s demand elasticity with respect to price, a demand-side force. 27. Our point estimate from Table II, Panel B, column (2), is that one supermarket entry within a 10-minute drive of a bottom income quartile household reduces the top minus bottom quartile Health Index difference by $\frac{0.005}{0.56}\approx 0.9\%$. 28. Our demand estimates take the existing SNAP program as given, so this counterfactual should be interpreted as adding a subsidy while leaving the rest of SNAP unchanged. Our estimates provide no evidence to evaluate the existing SNAP program.
References
Aizer Anna, Currie Janet, “The Intergenerational Transmission of Inequality: Maternal Disadvantage and Health at Birth,” Science, 344 (2014), 856–861.
Algert Susan J., Agrawal Aditya, Lewis Douglas S., “Disparities in Access to Fresh Produce in Low-Income Neighborhoods in Los Angeles,” American Journal of Preventive Medicine, 30 (2006), 365–370.
Allcott Hunt, Diamond Rebecca, Dubé Jean-Pierre, “The Geography of Poverty and Nutrition: Food Deserts and Food Choices Across the United States,” NBER Working Paper no. 24094, 2017.
Allcott Hunt, Diamond Rebecca, Dubé Jean-Pierre, Handbury Jessie, Rahkovsky Ilya, Schnell Molly, “Replication Data for: ‘Food Deserts and the Causes of Nutritional Inequality’,” (2019). Harvard Dataverse, doi:10.7910/DVN/MSOBYI.
Allcott Hunt, Lockwood Benjamin, Taubinsky Dmitry, “Regressive Sin Taxes, with an Application to the Optimal Soda Tax,” Quarterly Journal of Economics, 134 (2019), 1557–1626.
Alwitt Linda F., Donley Thomas D., “Retail Stores in Poor Urban Neighborhoods,” Journal of Consumer Affairs, 31 (1997), 139–164.
Anderson Michael, Matsa David, “Are Restaurants Really Supersizing America?,” American Economic Journal: Applied Economics, 3 (2011), 152–188.
Atkin David, “The Caloric Costs of Culture: Evidence from Indian Migrants,” American Economic Review, 106 (2016), 1144–1181.
Aussenberg Randy Alison, “SNAP and Related Nutrition Provisions of the 2014 Farm Bill (P.L. 113-79),” Congressional Research Service Report, 2014.
Baker E. A., Schootman M., Barnidge E., Kelly C., “The Role of Race and Poverty in Access to Foods that Enable Individuals to Adhere to Dietary Guidelines,” Preventing Chronic Disease, 3 (2006), A76.
Bartlett Susan, Klerman Jacob, Olsho Lauren, Logan Christopher, Blocklin Michelle, Beauregard Marianne, Enver Ayesha, Wilde Parke, Owens Cheryl, Melhem Margaret, “Evaluation of the Healthy Incentives Pilot (HIP): Final Report,” Prepared by Abt Associates for the U.S. Department of Agriculture, Food and Nutrition Service, 2014.
Baum Charles, Chou Shin-Yi, “Why Has the Prevalence of Obesity Doubled?,” Review of Economics of the Household, 14 (2016), 251–267.
Bitler Marianne, Haider Steven, “An Economic View of Food Deserts in the United States,” Journal of Policy Analysis and Management, 30 (2011), 153–176.
Boss Donna, “How Independents Can Meet Consumers’ Private Label Desires,” Supermarket News, August 24, 2018.
Bronnenberg Bart J., Dubé Jean-Pierre, Gentzkow Matthew, “The Evolution of Brand Preferences: Evidence from Consumer Migration,” American Economic Review, 102 (2012), 2472–2508.
Carlson Andrea C., Tselepidakis Page Elina, Palmer Zimmerman Thea, Tornow Carina E., Hermansen Sigurd, “Linking USDA Nutrition Databases to IRI Household-Based and Store-Based Scanner Data,” Technical Bulletin TB-1952, U.S. Department of Agriculture, Economic Research Service, 2019.
Case Anne, Deaton Angus, “Rising Morbidity and Mortality in Midlife among White Non-Hispanic Americans in the 21st Century,” Proceedings of the National Academy of Sciences, 112 (2015), 15078–15083.
CDC, “State Initiatives Supporting Healthier Food Retail: An Overview of the National Landscape,” 2011, https://www.cdc.gov/obesity/downloads/Healthier_Food_Retail.pdf.
Center for Budget Policy Priorities, “A Quick Guide to SNAP Eligibility and Benefits,” 2018.
Chetty Raj, Hendren Nathan, Kline Patrick, Saez Emmanuel, “Where Is the Land of Opportunity? The Geography of Intergenerational Mobility in the United States,” Quarterly Journal of Economics, 129 (2014), 1553–1623.
Chetty Raj, Stepner Michael, Abraham Sarah, Lin Shelby, Scuderi Benjamin, Turner Nicholas, Bergeron Augustin, Cutler David, “The Association Between Income and Life Expectancy in the United States, 2001–2014,” Journal of the American Medical Association, 315 (2016), 1750–1766.
Chou Shin-Yi, Grossman Michael, Saffer Henry, “An Economic Analysis of Adult Obesity: Results from the Behavioral Risk Factor Surveillance System,” Journal of Health Economics, 23 (2004), 565–587.
Courtemanche Charles J., Carden Art, “Competing with Costco and Sam’s Club: Warehouse Club Entry and Grocery Prices,” NBER Working Paper no. 17220, 2011.
Courtemanche Charles J., Carden Art, Ndirangu Murugi, Zhou Xilin, “Do Walmart Supercenters Improve Food Security?,” NBER Technical Report, 2018.
Courtemanche Charles J., Pinkston Joshua C., Ruhm Christopher J., Wehby George L., “Can Changing Economic Factors Explain the Rise in Obesity?,” Southern Economic Journal, 82 (2016), 1266–1310.
Cummins Steven, Findlay Anne, Petticrew Mark, Sparks Leigh, “Healthy Cities: The Impact of Food Retail-led Regeneration on Food Access, Choice, and Retail Structure,” Built Environment, 31 (2005), 288–301.
Currie Janet, DellaVigna Stefano, Moretti Enrico, Pathania Vikram, “The Effect of Fast Food Restaurants on Obesity and Weight Gain,” American Economic Journal: Economic Policy, 2 (2010), 32–63.
Curtis Colleen, “First Lady Michelle Obama on Making a Difference in Cities with Food Deserts,” The White House, October 25, 2011.
Cutler David, Lleras-Muney Adriana, “Understanding Differences in Health Behaviors by Education,” Journal of Health Economics, 29 (2010), 1–28.
Cutler David M., Glaeser Edward L., Shapiro Jesse M., “Why Have Americans Become More Obese?,” Journal of Economic Perspectives, 17 (2003), 93–118.
Darmon Nicole, Drewnowski Adam, “Does Social Class Predict Diet Quality?,” American Journal of Clinical Nutrition, 87 (2008), 1107–1117.
Davis Brennan, Carpenter Christopher, “Proximity of Fast-Food Restaurants to Schools and Adolescent Obesity,” American Journal of Public Health, 99 (2009), 505–510.
DellaVigna Stefano, Gentzkow Matthew, “Uniform Pricing in US Retail Chains,” Quarterly Journal of Economics, 134 (2019), 2011–2084.
Dubois Pierre, Griffith Rachel, Nevo Aviv, “Do Prices and Attributes Explain International Differences in Food Purchases?,” American Economic Review, 104 (2014), 832–867.
Dunn Richard A., “The Effect of Fast-Food Availability on Obesity: An Analysis by Gender, Race, and Residential Location,” American Journal of Agricultural Economics, 92 (2010), 1149–1164.
Eid Jean, Overman Henry G., Puga Diego, Turner Matthew A., “Fat City: Questioning the Relationship between Urban Sprawl and Obesity,” Journal of Urban Economics, 63 (2008), 385–404.
Elbel Brian, Moran Alyssa, Beth Dixon L., Kiszko Kamila, Cantor Jonathan, Abrams Courtney, Mijanovich Tod, “Assessment of a Government-Subsidized Supermarket in a High-Need Area on Household Food Availability and Children’s Dietary Intakes,” Public Health Nutrition, 26 (2015), 1–10.
Finkelstein Amy, Gentzkow Matthew, Williams Heidi, “Sources of Geographic Variation in Health Care: Evidence From Patient Migration,” Quarterly Journal of Economics, 131 (2016), 1681–1726.
Finkelstein Amy, Gentzkow Matthew, Williams Heidi, “Place-Based Drivers of Mortality: Evidence from Migration,” MIT Working Paper, 2018a.
Finkelstein Amy, Gentzkow Matthew, Williams Heidi, “What Drives Prescription Opioid Abuse? Evidence from Migration,” SIEPR Working Paper, 2018b.
Food Trust, “The Success of HFFI,” 2019, http://thefoodtrust.org/what-we-do/administrative/hffi-impacts/the-success-of-hffi.
Frank John, “How Safeway Is Building its Own Brands,” Dairy Foods, 113 (2012), 64–65.
Furnee C.A., Groot W., van den Brink H.M., “The Health Effects of Education: A Meta-Analysis,” European Journal of Public Health, 18 (2008), 417–421.
Gelbach Jonah B., “When Do Covariates Matter? And Which Ones, and How Much?,” Journal of Labor Economics, 34 (2016), 509–543.
Grossman Michael, “The Relationship between Health and Schooling: What’s New?,” NBER Working Paper no. 21609, 2015.
Guenther P.M., Reedy J., Krebs-Smith S.M., “Development of the Healthy Eating Index,” Journal of the American Dietetic Association, 108 (2008), 1896–1901.
Handbury Jessie, Rahkovsky Ilya, Schnell Molly, “What Drives Nutritional Disparities? Retail Access and Food Purchases across the Socioeconomic Spectrum,” NBER Working Paper no. 21126, 2015.
Hausman Jerry, “Valuation of New Goods under Perfect and Imperfect Competition,” The Economics of New Goods, Bresnahan Timothy, Gordon Robert J., eds. (Chicago: University of Chicago Press, 1996), 207–248.
Horowitz Carol R., Colson Kathryn A., Hebert Paul L., Lancaster Kristie, “Barriers to Buying Healthy Foods for People with Diabetes: Evidence of Environmental Disparities,” American Journal of Public Health, 94 (2004), 1549–1554.
Hut Stefan, “Determinants of Dietary Choice in the US: Evidence from Consumer Migration,” Brown University, Working Paper, 2018.
Jetter Karen M., Cassady Diana L., “The Availability and Cost of Healthier Food Alternatives,” American Journal of Preventive Medicine, 30 (2005), 38–44.
Kliemann N., Wardle J., Johnson F., Croker H., “Reliability and Validity of a Revised Version of the General Nutrition Knowledge Questionnaire,” European Journal of Clinical Nutrition, 70 (2016), 1174–1180.
Kling Jeffrey R., Liebman Jeffrey B., Katz Lawrence F., “Experimental Analysis of Neighborhood Effects,” Econometrica, 75 (2007), 83–119.
Lakdawalla Darius, Philipson Tomas, Bhattacharya Jay, “Welfare-Enhancing Technological Change and the Growth of Obesity,” American Economic Review, Papers and Proceedings, 95 (2005), 253–257.
Larson Nicole, Story Mary, Nelson Melissa, “Neighborhood Environments: Disparities in Access to Healthy Foods in the U.S.,” American Journal of Preventive Medicine, 36 (2009), 74–81.
Lipman Melissa, “FTC May Waste Time Updating Price-Bias Guide, Attys Say,” Law360, November 29, 2012.
Ludwig Jens, Sanbonmatsu Lisa, Gennetian Lisa, Adam Emma, Duncan Greg J., Katz Lawrence F., Kessler Ronald C., Kling Jeffrey R., Lindau Stacy Tessler, Whitaker Robert C., McDade Thomas W., “Neighborhoods, Obesity, and Diabetes—A Randomized Social Experiment,” New England Journal of Medicine, 365 (2011), 1509–1519.
Molitor David, “The Evolution of Physician Practice Styles: Evidence from Cardiologist Migration,” American Economic Journal: Economic Policy, 10 (2018), 326–356.
Okrent Abigail, Kumcu Aylin, “U.S. Households’ Demand for Convenience Foods,” USDA Economic Research Report, 2016.
Oster Emily, “Unobservable Selection and Coefficient Stability: Theory and Evidence,” Journal of Business and Economic Statistics, 37 (2019), 187–204.
Parker Gavin, Agostinelli Alessio, Gottstein Holger, Jacobsen Rune, Arouch Youssi, Oberschmidt Kai, “Why Grocers Need to Start Operating Like Consumer Brands,” Boston Consulting Group, April 17, 2018.
Powell Lisa M., Slater Sandy, Mirtcheva Donka, Bao Yanjun, Chaloupka Frank J., “Food Store Availability and Neighborhood Characteristics in the United States,” Preventive Medicine, 44 (2007), 189–195.
Rashad Inas, “Structural Estimation of Caloric Intake, Exercise, Smoking, and Obesity,” Quarterly Review of Economics and Finance, 46 (2006), 268–283.
Rashad Inas, Grossman Michael, Chou Shin-Yi, “The Super Size of America: An Economic Estimation of Body Mass Index and Obesity in Adults,” Eastern Economic Journal, 32 (2006), 133–148.
Rehm Colin D., Penalvo Jose L., Afshin Ashkan, Mozaffarian Dariush, “Dietary Intake Among U.S. Adults, 1999–2012,” Journal of the American Medical Association, 315 (2016), 2542–2553.
Reinvestment Fund, “The Healthy Food Financing Initiative,” 2017, http://www.healthyfoodaccess.org/resources/library/healthy-food-financing-initiative-hffi.
Richards Michael R., Sindelar Jody L., “Rewarding Healthy Food Choices in SNAP: Behavioral Economic Applications,” Milbank Quarterly, 91 (2013), 395–412.
Saez Emmanuel, Piketty Thomas, “Income Inequality in the United States, 1913–1998,” Quarterly Journal of Economics, 118 (2003), 1–39.
Schanzenbach Diane Whitmore, “Pros and Cons of Restricting SNAP Purchases,” Brookings Institution, February 16, 2017, https://www.brookings.edu/testimonies/pros-and-cons-of-restricting-snap-purchases/.
Sharkey Joseph R., Horel Scott, Dean Wesley R., “Neighborhood Deprivation, Vehicle Ownership, and Potential Spatial Access to a Variety of Fruits and Vegetables in a Large Rural Area in Texas,” International Journal of Health Geographics, 9 (2010), 1–27.
Shenkin Jonathan D., Jacobson Michael F., “Using the Food Stamp Program and Other Methods to Promote Healthy Diets for Low-Income Consumers,” American Journal of Public Health, 100 (2010), 1562–1564.
Song Hee-Jung, Gittelsohn Joel, Kim Miyong, Suratkar Sonali, Sharma Sangita, Anliker Jean, “A Corner Store Intervention in a Low-Income Urban Community Is Associated with Increased Availability and Sales of Some Healthy Foods,” Public Health Nutrition, 12 (2009), 2060–2067.
U.S. Department of Agriculture, Agricultural Research Service, “Away from Home: Percentages of Selected Nutrients Contributed by Food and Beverages Consumed away from Home, by Family Income (in Dollars) and Age. What We Eat in America, NHANES 2011–2012,” 2014, http://www.ars.usda.gov/Services/docs.htm?docid=18349.
U.S. Department of Agriculture, Agricultural Research Service, “USDA Food and Nutrient Database for Dietary Studies 2015–2016,” 2016, https://www.ars.usda.gov/northeast-area/beltsville-md-bhnrc/beltsville-human-nutrition-research-center/food-surveys-research-group/docs/fndds-download-databases/.
U.S. Department of Agriculture, Agricultural Research Service, “USDA National Nutrient Database for Standard Reference, Release 28,” 2018, http://www.ars.usda.gov/ba/bhnrc/ndl.
U.S. Department of Health and Human Services, “FoodKeeper App,” 2015, https://www.foodsafety.gov/keep/foodkeeperapp/index.html.
Volpe Richard, Okrent Abigail, “Assessing the Healthfulness of Consumers’ Grocery Purchases,” USDA Economic Information Bulletin (2012).
Volpe Richard, Okrent Abigail, Leibtag Ephraim, “The Effect of Supercenter-Format Stores on the Healthfulness of Consumers’ Grocery Purchases,” American Journal of Agricultural Economics, 95 (2013), 568–589.
Wang Dong D., Leung Cindy W., Li Yanping, Ding Eric L., Chiuve Stephanie E., Hu Frank B., Willett Walter C., “Trends in Dietary Quality among Adults in the United States, 1999 through 2010,” JAMA Internal Medicine, 174 (2014), 1587–1595.
Watson Elaine, “No Light At the End of the Tunnel for Dean Foods as Milk Price Wars Escalate, Says Bernstein,” Food Navigator, August 7, 2017.
Weatherspoon Dave, Oehmke James, Dembele Assa, Coleman Marcus, Satimanon Thasanee, Weatherspoon Lorraine, “Price and Expenditure Elasticities for Fresh Fruits in an Urban Food Desert,” Urban Studies, 50 (2012), 88–106.
Wrigley Neil, Warm Daniel, Margetts Barrie, “Deprivation, Diet, and Food Retail Access: Findings from the Leeds ‘Food Deserts’ Study,” Environment and Planning A, 35 (2003), 151–188.
Yang Quanhe, Zhang Zefeng, Gregg Edward W., Flanders W. Dana, Merritt Robert, Hu Frank B., “Added Sugar Intake and Cardiovascular Diseases Mortality Among US Adults,” Journal of the American Medical Association Internal Medicine, 174 (2014), 516–524.
© The Author(s) 2019. Published by Oxford University Press on behalf of President and Fellows of Harvard College.
This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model) TI - Food Deserts and the Causes of Nutritional Inequality JF - The Quarterly Journal of Economics DO - 10.1093/qje/qjz015 DA - 2019-11-01 UR - https://www.deepdyve.com/lp/oxford-university-press/food-deserts-and-the-causes-of-nutritional-inequality-wAQLcGFBSk SP - 1793 VL - 134 IS - 4 DP - DeepDyve ER -