# Quantifying Sentiment with News Media across Local Housing Markets

## Abstract

This paper develops the first measures of housing sentiment for 34 cities across the United States by quantifying the qualitative tone of local housing news. Housing media sentiment has significant predictive power for future house prices, leading prices by nearly two years. Consistent with theories of investor sentiment, the media sentiment index has a greater effect in markets in which speculative investors are more prevalent and demand appears less informed. A direct examination of the content across news articles shows that the results are not driven by news stories of unobserved fundamentals.

Received October 21, 2015; editorial decision December 19, 2017 by Editor Robin Greenwood. Authors have furnished an Internet Appendix, which is available on the Oxford University Press Web site next to the link to the final published paper online.

Sentiment, broadly defined as the psychology behind investor beliefs, has long been posited as a determinant of asset price variation (Keynes 1936). The potential role of sentiment is particularly important for the housing market, where financial errors can have very large consequences. Over two-thirds of households in the United States own a home and invest the majority of their portfolio in real estate (Tracy, Schneider, and Chan 1999, Nakajima 2005). The housing bust in 2006 devastated millions of homeowners across the United States and overwhelmed banks and financial institutions that held significant investments in mortgage-backed securities and other housing-related assets. The collapse of the subprime mortgage industry quickly followed, leading the nation into its worst economic recession since the 1930s. Shiller (2009) and others notably argue that “animal spirits,” or an irrational exuberance of investors, was a significant factor in the dramatic boom and bust of house prices in some markets.
Testing this theory, however, is difficult without empirical measures of housing sentiment. Beliefs are unobservable and therefore not straightforward to quantify, and sentiment measures for the housing market are particularly difficult to obtain. Typical sentiment proxies for the stock market, such as mutual fund flows, dividend premiums, and closed-end fund discounts, are naturally not available for the housing market (Baker and Wurgler 2006). Survey measures that are available are limited in geographic scope. While many cities experienced unprecedented price growth in the last housing cycle, many others simultaneously saw minimal or stable movements in prices (Ferreira and Gyourko 2012). Therefore, understanding the variation of price growth across local markets requires analogous city-level measures.

The goal of this paper is to provide real-time local measures of sentiment across a number of city housing markets. I construct 34 city-specific housing sentiment indices corresponding to major metropolitan areas across the United States. I measure housing sentiment by quantifying the qualitative tone of local housing news media coverage. This strategy is motivated by seminal literature on asset price bubbles that argues news media has a prominent relationship with sentiment through an incentive to cater to readers’ preferences (Kindleberger 1978, Galbraith 1990, Shiller 2005). This methodology is further supported by pioneering empirical work by Tetlock (2007) and others who have found media tone to be a consistent proxy for sentiment in the stock market. By analyzing local housing news articles each month, I quantify the share of the positive and negative local media tone in each market from 2000 to the end of 2013. To my knowledge, this paper contributes the first set of city measures of housing sentiment across several markets. Housing media sentiment indices move ahead of price levels at a significant lead.
Cities that experienced dramatic rises and declines in house prices are preceded by similar cycles in sentiment, whereas cities with milder price changes are led by more subdued or random patterns in sentiment growth. In cities with large swings in prices, sentiment appears to peak around 2004. I validate my sentiment index against two available surveys of housing market expectations, although these surveys cover only the national level and a limited sample of cities. I find that my media sentiment index highly correlates with the national survey and with the four cities assessed by the Case, Shiller, and Thompson (2012) survey. Both surveys also see expectations peak in early 2004, at a similar lead-lag pattern to house price levels. I further validate my sentiment index against an alternative multifactor index created with a set of proxies following the methodology of Baker and Wurgler (2007) and find that these also correlate highly.

Consistent with these leading patterns, I find that housing media sentiment has robust predictive power for future house price growth. I find these effects to be both statistically significant and large in economic magnitude. This is notable as historical predictive factors have had difficulty explaining the wide swings in house prices during this period. Fundamental determinants that traditionally explained past patterns in house prices account for just a small fraction of changes post-2000 (Lai and Van Order 2010). From 2004 to 2006, for example, house prices in Miami increased by nearly 54%. Observed economic fundamentals account for approximately 21 percentage points of this growth, while the media sentiment index explains an additional 32 percentage points. These results suggest that media sentiment serves as a useful predictive factor for house prices above and beyond traditional observed variables. The positive association between the media sentiment index and future house prices is consistent with two different interpretations.
One explanation is that these results capture the effects of sentiment, where over-exuberant beliefs pushed prices away from fundamentals. Examining the structure of the media sentiment index more closely reveals a backward-looking nature over past returns consistent with behavioral theories of extrapolative expectations. An alternative interpretation is that local unobserved fundamentals simultaneously drove house prices and beliefs such that the expectations of home buyers were justified by local information. One could even argue that the leading pattern of the media sentiment index resembles a perfect forecast in some cities. The media sentiment index could instead serve as a valuable indicator of local market information that was otherwise unobserved. I subsequently perform a series of tests to explore these two competing explanations. I first exploit the textual nature of my data to directly evaluate the media index as a proxy for local information. By tracking all words across all articles, I can examine the content across articles firsthand. I then separate the articles to create a set of “media fundamentals” indices that track the positive and negative tone across articles that discuss any relevant housing market information. I find that the effect of the housing sentiment index remains extremely robust to a discussion of fundamentals, both separately and together. The only set of articles that shares some predictive power with the media sentiment index are those that discuss credit conditions. The inclusion of additional controls for subprime lending patterns and the availability of credit, however, does not affect the significance and magnitude of the predictive effect of the housing media sentiment index. In the final section, I take advantage of the availability of the cross-sectional indices to test effects that should be consistent with an interpretation of sentiment. 
Baker and Wurgler (2006) highlight two channels through which theory predicts sentiment has cross-sectional effects on prices: (1) where demand is less informed and (2) where arbitrage constraints are particularly binding. Since arbitrage is particularly binding in the housing market, theoretical models of investor sentiment predict that markets with less-informed demand should be subject to greater effects of sentiment. Shiller (2009) and others have raised concerns particularly over the disproportionate lack of access to financial advice available to minority and low-income households. Consistent with these arguments, I find that sentiment effects are significantly greater in markets with more minority mortgage applicants and among lower-priced homes. Theories on sentiment further suggest that rational traders may respond to increased sentiment in the market to anticipate increasing asset prices. Indeed, I find that the effect of sentiment is also notably larger among housing markets with a greater entrance of speculative investors. Finally, I show that while subprime lending patterns do not affect the impact of sentiment on prices, the effect of sentiment is proportionally greater in markets where subprime lending was more prevalent. Many voiced suspicions that subprime lenders targeted those most uninformed, and we would expect those subject to sentiment to be more vulnerable to taking out risky, subprime loans. These results also provide potential context for why prior studies have been unable to explain the magnitude of the boom with the ease of credit alone, and highlight potential subprime lending effects through a relationship with sentiment.

## 1. The Housing Sentiment Index

### 1.1 Background and related literature

Prominent literature on bubbles and panics stresses that the news media have an important relationship with investor beliefs (Kindleberger 1978, Galbraith 1990, Shiller 2005).
They argue that newspapers have a demand-side incentive to cater to reader preferences and will reflect readers’ expectations over assets they own. Mullainathan and Shleifer (2005) formalize these arguments by assuming readers have a disutility for news that is inconsistent with their beliefs, citing psychology literature that shows people have a tendency to favor information that confirms their priors. Gentzkow and Shapiro (2010) find empirical evidence that readers have a preference for news consistent with their political beliefs and that news outlets respond accordingly. At the same time, Shiller (2005) argues that news outlets have the power to influence reader beliefs through a chosen tone or an emphasis of particular positive or negative events. Empirical studies of stock market sentiment have found consistent evidence for these arguments linking media coverage and market activity. Tetlock (2007) employs textual analysis to quantify media sentiment in the stock market and finds consistent evidence for its predictive effect on asset prices. Dougal et al. (2012) provide further causal evidence of financial media influencing subsequent investor behavior and stock market returns. Additional studies have further applied textual analysis techniques to capture the sentiment of earnings announcements, investor chat rooms, corporate 10-K reports, and initial public offering (IPO) prospectuses and linked to outcomes such as firm earnings, stock returns, and trading volume (Engelberg 2008, Antweiler and Murray 2004, Li 2006, Hanley and Hoberg 2010). Housing is an asset that receives heavy media attention as “a source of endless fascination for the general public, because we live in houses, we work on them every day” (Shiller 2005, p. 102). Housing is also a widely held investment by individual buyers who may be more likely subject to media slant than the typical stock market investor. 
Media tone thus presents a unique opportunity to capture expectations in a market where alternative proxies are otherwise difficult to obtain. Surveys such as Case and Shiller (2003) and the Michigan/Reuters Survey of Consumers (SOC) can provide direct assessments of home buyer expectations, but can be expensive to run and are therefore limited in frequency and/or restricted in geographic breakdown. Because news is local and recurring, the news media provides a potential medium through which we can quantify housing sentiment both across time and city by city.

Examining the impact of sentiment across housing markets is particularly important after the most recent housing cycle. Standard economic explanations for the housing boom have so far been difficult to reconcile empirically (Gerardi et al. 2008, Glaeser, Gottlieb, and Gyourko 2010, Ferreira and Gyourko 2011). Observed fundamentals that accounted for nearly 70% of the variation in national house price growth from 1987 to 2000 explain less than 10% of the variation from 2000 to 2011 (Lai and Van Order 2010). While there was much discussion over the potential role of sentiment in the last housing cycle, empirical evidence of this theory has been limited and largely anecdotal because of the lack of housing sentiment measures available. This paper thus provides a first opportunity to examine the role of sentiment in housing both across markets and across time empirically.

### 1.2 Calculating the housing sentiment index

My source for news articles is Factiva.com, a comprehensive online database of newspapers. Factiva categorizes its articles by subject, and provides a subject code that identifies articles that discuss local real estate markets. This code is determined by a proprietary algorithm that remains objective across all newspapers and years. Routine real estate property listings are not included.
Wire service articles (such as those by the Associated Press) are also generally excluded, as syndicated stories cannot be redistributed and typically do not appear in the Factiva database. Thus, articles in my sample are those written by local staff reporters. I collect all news articles discussing real estate markets between January 2000 and December 2013 from the major newspaper publications of 34 cities, including the twenty cities followed by the Case-Shiller home price indices. Most cities have one major newspaper that dominates the news market, with the exception of Boston, Detroit, and Los Angeles, which have two. Table 1 lists all newspaper sources for each city. I extract the full text of each article and record each individual word with its corresponding date, word position, and originating newspaper. My final data set assembles 37,537 newspaper articles, consisting of over 29 million words.

Table 1. Major newspaper publications by city

| City | Newspaper |
| --- | --- |
| Atlanta | The Atlanta Journal-Constitution |
| Austin | Austin American-Statesman |
| Baltimore | The Baltimore Sun |
| Boston | Boston Herald/Boston Globe |
| Charlotte | The Observer |
| Chicago | Chicago Tribune |
| Cincinnati | The Cincinnati Enquirer |
| Cleveland | The Plain Dealer |
| Columbus (OH) | The Columbus Dispatch |
| Dallas | The Dallas Morning News |
| Denver | The Denver Post |
| Detroit | Detroit News/Detroit Free Press |
| Indianapolis | The Indianapolis Star |
| Kansas City | The Kansas City Star |
| Los Angeles | LA Times/LA Daily News |
| Las Vegas | Las Vegas Review-Journal |
| Milwaukee | The Milwaukee Journal Sentinel |
| Minneapolis | Star Tribune |
| Miami | The Miami Herald |
| NYC | New York Times |
| Orlando | Orlando Sentinel |
| Philadelphia | The Philadelphia Inquirer |
| Phoenix | The Arizona Republic |
| Pittsburgh | Pittsburgh Post-Gazette |
| Portland | The Oregonian |
| Sacramento | Sacramento Bee |
| San Antonio | San Antonio Express-News |
| San Diego | The San Diego Union-Tribune |
| San Francisco | The San Francisco Chronicle |
| San Jose | San Jose Mercury News |
| Seattle | The Seattle Times |
| St. Louis | St. Louis Post-Dispatch |
| Tampa | Tampa Tribune |
| Washington, D.C | The Washington Post |

Table 1 lists each city and its corresponding newspaper source in my sample of housing news articles ($$N=37{,}537$$). I draw from one major newspaper publication for most cities, with the exception of Boston, Detroit, and Los Angeles, in which I draw from the two major newspapers in the area. My sample covers articles from January 2000 to December 2013.

I capture media tone through a textual analysis of the content across newspaper articles. Textual analysis is an increasingly popular methodology used to quantify the tone of financial documents (Antweiler and Murray 2004, Loughran and McDonald 2011, Jegadeesh and Wu 2013, Hanley and Hoberg 2010). I apply the most standard methodology employed by this literature, which uses a dictionary-based method to quantify the raw frequency of positive and negative words in a text.
To do so, these papers typically identify words as positive or negative based on an external word list. External word lists such as those from the Harvard IV-4 Psychological Dictionary are preferred because they are predetermined and less vulnerable to subjectivity from the author. Recent studies have argued, however, that these general tonal lists can at times contain irrelevant words and lead to noisy measures (Tetlock, Saar-Tsechansky, and Macskassy 2008). For example, Engelberg (2008) points out that the Harvard IV-4 positive list, which contains words such as company or shares, can unintentionally capture other effects in finance applications. Loughran and McDonald (2011) show that the noise introduced by the general Harvard negative word list can also be substantial and argue that word lists should be discipline specific to reduce measurement error.

To balance these concerns, I employ a predetermined list from the Harvard IV-4 dictionary to reduce subjectivity, but choose one that is relevant to how the media expresses positive or negative tone over housing. Shiller (2008) suggests that the media embellishes market activity by employing superlatives that emphasize increases and upward movements. For example, my sample includes articles with headlines such as “Home Sales Skyrocket!”, “Home Prices Zoom Up”, or “Housing is HOT, HOT, HOT!!” Thus, to capture words like skyrocket, zoom, or hot, I use the Harvard IV-4 lists Increase and Rise as a base set.1 I remove any words from the original list that would result in obvious misclassifications, and then expand the remaining words with corresponding superlatives. Following Loughran and McDonald (2011), I also expand the list with inflections and tenses that retain the original meaning of each word. Word counts for the root word skyrocket, for example, also include skyrockets, skyrocketed, and skyrocketing.
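The expansion of a root word into its inflections can be illustrated with a minimal Python sketch. This is not the paper's procedure: the suffix rules are a naive assumption that only handles regular verbs, and the `IRREGULAR` table, `expand_inflections`, and `build_word_list` are hypothetical names introduced here for illustration. In practice each generated form would still need a manual check that it retains the original meaning.

```python
# Hypothetical table of hand-entered irregular forms (an assumption, not from the paper).
IRREGULAR = {"rise": {"rise", "rises", "rose", "risen", "rising"}}


def expand_inflections(root):
    """Return the root plus naive regular-verb inflections (-s, -ed, -ing)."""
    root = root.lower()
    if root in IRREGULAR:
        return set(IRREGULAR[root])
    if root.endswith("e"):
        # e.g. increase -> increases, increased, increasing
        return {root, root + "s", root + "d", root[:-1] + "ing"}
    # e.g. skyrocket -> skyrockets, skyrocketed, skyrocketing
    return {root, root + "s", root + "ed", root + "ing"}


def build_word_list(roots):
    """Union of the expanded forms of every root word."""
    expanded = set()
    for root in roots:
        expanded |= expand_inflections(root)
    return expanded


positive_words = build_word_list(["skyrocket", "zoom", "increase", "rise"])
```

Counting every form in `positive_words` toward its root is what allows a single entry such as skyrocket to absorb skyrockets, skyrocketed, and skyrocketing.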
The original Harvard IV-4 lists include 136 words and the expanded list, including inflections and synonyms, contains 403 words.2 I repeat the above process to create negative word lists using the converse Harvard IV-4 lists Decrease and Fall. Table 2 reports the top twenty most frequently occurring positive and negative words in my sample.

Table 2. Top twenty most frequent positive and negative words

| Rank | Positive | % of pos words | Negative | % of neg words |
| --- | --- | --- | --- | --- |
| 1 | UP | 9.05 | DOWNS | 16.30 |
| 2 | HIGHS | 5.93 | LOW | 11.91 |
| 3 | INCREASING | 5.01 | FALLING | 10.27 |
| 4 | RISE | 4.58 | DECLINING | 8.58 |
| 5 | GREAT | 4.34 | DROPPING | 7.68 |
| 6 | SUSTAINS | 4.08 | FORECLOSING | 4.45 |
| 7 | MOST | 3.44 | SLOW | 3.78 |
| 8 | BIGGEST | 3.33 | CONTRACT | 2.57 |
| 9 | FRENZINESS | 2.92 | RECESSION | 2.04 |
| 10 | FASTEST | 2.76 | BUBBLE | 1.53 |
| 11 | BEST | 2.57 | DIPS | 1.44 |
| 12 | PUSHING | 2.51 | CONCERNING | 1.32 |
| 13 | RECORD | 2.34 | FLATTENING | 1.25 |
| 14 | HEALTHIER | 2.22 | SLUMPED | 1.21 |
| 15 | STRENGTHEN | 2.02 | WORRYING | 1.21 |
| 16 | GOOD | 1.98 | STOPPING | 1.18 |
| 17 | BOOMING | 1.84 | REDUCTION | 1.17 |
| 18 | WELL | 1.83 | COOL | 1.11 |
| 19 | GAIN | 1.63 | CRISIS | 1.03 |
| 20 | VERY | 1.48 | WEAKEN | 1.00 |

The word counts for each listed word include different tenses and inflections. So, for example, “boom” includes counts for “booms,” “boomed,” and “booming.”

I calculate the overall tone of housing news in each city $$i$$ and period $$t$$ by

$$S_{it}=\frac{\#pos_{it}-\#neg_{it}}{\#totalwords_{it}}, \tag{1}$$

that is, the number of positive minus negative words divided by the total number of words across all housing articles in each period.3 I additionally adjust both negative and positive word counts for negation using the terms: no, not, none, neither, never, and nobody.
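As an illustration of Equation (1), the following Python sketch computes the tone measure for the articles of one city and period. It rests on several assumptions that the text does not specify: the regular-expression tokenizer, the function names `tokenize` and `tone_index`, and the choice to simply leave a negated tonal word out of both counts are all hypothetical. The five-word negation window follows the rule stated in the text.

```python
import re

# Negation terms listed in the text.
NEGATIONS = {"no", "not", "none", "neither", "never", "nobody"}


def tokenize(text):
    """Lowercase a text and split it into alphabetic tokens (assumed tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tone_index(articles, positive, negative, window=5):
    """(#pos - #neg) / #totalwords over all articles in one city-period."""
    n_pos = n_neg = n_total = 0
    for text in articles:
        tokens = tokenize(text)
        n_total += len(tokens)
        for k, token in enumerate(tokens):
            if token not in positive and token not in negative:
                continue
            preceding = tokens[max(0, k - window):k]
            if any(word in NEGATIONS for word in preceding):
                # Assumption: a negated tonal word is dropped from both counts.
                continue
            if token in positive:
                n_pos += 1
            else:
                n_neg += 1
    return (n_pos - n_neg) / n_total if n_total else 0.0


articles = [
    "Home sales skyrocket as prices rise",
    "Prices did not rise. Meanwhile across the region sales are falling",
]
s_it = tone_index(articles, positive={"skyrocket", "rise"}, negative={"falling"})
```

In this toy input the second occurrence of rise is preceded by not and is therefore skipped, so the numerator is two positive words minus one negative word over seventeen total words.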
I consider a word negated if it is preceded within five words by one of these negation terms.4 The above calculation represents the raw, baseline index of media tone. Loughran and McDonald (2011) propose an alternative “term-weighted” index that also adjusts for the commonality and frequency of a word across documents. I find that this and other alternative variations are highly correlated with the raw baseline version above.5

### 1.3 Survey validation

Validating measures of housing sentiment is by nature challenging when beliefs are unobservable and alternative proxies are otherwise rare. Existing surveys of home buyer confidence cannot validate each measure city by city, but can be used to compare overall trends on the national level. The University of Michigan/Reuters Survey of Consumers (SOC) surveys a nationally representative sample of 500 individuals each month on their attitudes toward business and buying conditions, including those of the housing market. Specifically, the SOC asks, “Generally speaking, do you think now is a good or bad time to buy a house?” Respondents answer “yes,” “no,” or “do not know.” Figure 1 plots the percentage of respondents that answered “yes” against a national version of my housing expectations index. I calculate a national index of media tone using the same weights applied to the twenty cities in the Case-Shiller Composite-20 home price index.

My measure of housing media sentiment reveals a strikingly similar pattern to the SOC survey measure. Both measures increase rapidly from 2000 to 2003, peaking in early 2004. Both fall rapidly from 2004 to 2006, dropping well below original levels of confidence in 2000. Media sentiment appears to generally lag survey confidence on average by 2 to 6 months, consistent with theories that the media responds to reader preferences in the housing market. Both measures rebound in early 2008, peaking and declining slightly again in late 2009.
Both of the increases occur before the temporary rebound of the housing market in 2009, but fall slightly again afterward. The correlation between the SOC survey and my media housing sentiment index is equal to approximately 0.84.

Figure 1. Validating media housing sentiment with survey of consumers. This figure compares the patterns of the composite-20 national housing media sentiment index with the Michigan/Reuters Survey of Consumers (SOC) survey of home buyers. The SOC surveys a nationally representative sample of 500 consumers and asks whether they think it is a good or bad time to buy a home. The SOC cannot be broken down by city, but provides a validation of the media sentiment index on a national level. The dashed line plots the SOC relative index, which equals the percentage who answered “Good” - “Bad” + 100. The solid line plots the housing media sentiment index, which equals the fraction of “Positive” - “Negative” words across all housing news articles per month. The media sentiment index lags the SOC survey index by a few months but mirrors a very similar pattern. The correlation between the SOC survey and (lagged) media sentiment is approximately 0.84.

Case, Shiller, and Thompson (2012) implement surveys that provide even more detailed perspectives on investor expectations. They directly ask respondents how much they expect their house price to grow over the next ten years. Answers reveal astonishingly high expectations, with respondents expecting prices to rise an average of 11% to 13% each year. Survey expenses limit coverage to four suburban areas and an annual snapshot, but nonetheless provide a valuable opportunity to validate my media index on a local level for at least four cities. Figure 2 plots the Case-Shiller survey measures against my housing sentiment index for San Francisco, Los Angeles, Boston, and Milwaukee. As with national trends, my sentiment index exhibits parallel patterns to survey measures on a city level. Both the media sentiment index and survey measures of expectations in San Francisco, Los Angeles, and Boston similarly peak in 2004 and hit a low point in 2008 before rising again. Measures for Milwaukee display more moderate patterns from 2003 to 2006, both in the survey and media index. Correlations between my housing sentiment index and the Case-Shiller Survey for each city range from approximately 0.7 to 0.74.

Figure 2. Validating media housing sentiment with Case-Shiller surveys. This figure compares the patterns of the housing media sentiment index with housing surveys by Case, Shiller, and Thompson (2012), who annually survey home buyer expectations in four cities from 2003 to 2012. The bars plot the percentage by which respondents think home prices will increase or decrease over the next year. The Case-Shiller survey is limited to an annual frequency, but provides a validation of the media sentiment index for four cities at the city level. The solid line plots the housing media sentiment index, which equals the percentage of positive minus negative words across all housing news articles per month. The media sentiment generally follows a very similar trending pattern city by city. The correlation between the Case-Shiller Survey and media index is equal to approximately 0.74 across cities.

### 1.4 Multifactor index

The high correlation of the sentiment index with the Case, Shiller, and Thompson (2012) home buyer surveys suggests that the media sentiment reflects home buyer expectations. The housing market is more complicated, however, and has a number of other types of agents (lenders, banks, speculative investors) who are also simultaneously part of potential newspaper readership. All of these agents’ expectations could also be influenced or reflected in news media articles, and therefore also be represented by the media sentiment index.
The media index thus likely represents the sentiment of multiple agents in the housing market, and can even represent its own, as journalists and editors are also potential buyers or sellers in the housing market. To validate the media sentiment index further, I create a multifactor sentiment index following the methodology in Baker and Wurgler (2006) that combines sentiment factors that may come from not only home buyers but also lenders, banks, or outside investors and speculators. I employ the following five potential proxies for sentiment to capture these different aspects of housing sentiment.

Transaction volume: If home buyers are extrapolating prices and buying homes in a frenzy, we would expect to see this activity reflected in increased trading volume.

Number of 2nd+ home purchases: Similarly, if investors are extrapolating prices and buying as speculators, we would see an increase in purchases of second or additional homes. During the housing boom, there were many anecdotal stories of individuals buying two or more homes with high expectations for future house prices. This could also capture activity of rational speculators looking to flip homes for investment, as a bet that sentiment among less sophisticated home buyers will continue to rise in the short term.

Fraction of subprime mortgages: Sentiment could be reflected not only in buyers but also in lending activity. If lenders are overconfident that prices will rise, they may be more willing to issue riskier mortgages. Following Ferreira and Gyourko (2012), this variable captures the share of loans issued by the top twenty subprime lenders ranked by Inside Mortgage Finance.

Loan-to-value ratio: If lenders are also affected by high market sentiment, then this could lead to egregious lending activity captured by absurdly high loan-to-value ratios. The loan-to-value ratios I include in this multifactor index incorporate not just the primary loan but any additional loans (up to three) taken out for a particular housing transaction.

Price-to-rent ratio: During the boom of house prices, Shiller (2005) and others frequently cited high price-to-rent ratios as evidence of overvalued housing markets. The “fundamental value” of an asset typically refers to the present discounted value of its future cash flows, and models of housing assume that housing pays dividends in the form of rental services. If prices are far above fundamental value, then we would expect price growth to run well ahead of the contemporaneous pace of rent growth.

Following Baker and Wurgler (2006), I orthogonalize each proxy on a set of economic variables (rents, population, income, unemployment, employment, and mortgage interest rates) to remove the potential influence of local housing market fundamentals. I then extract the first principal component from the residuals of these regressions. Following the rationale behind the Baker and Wurgler (2006) index, the first principal component then represents the common sentiment component across the five sentiment proxies and removes any remaining idiosyncrasies. The media sentiment and multifactor indices are highly correlated, with the multifactor index lagging by approximately three quarters. Since all of the proxies in the multifactor index are transactional outcomes in response to rising sentiment (e.g., buying or flipping a house or receiving a loan in response to positive sentiment), I find this consistent with the media capturing the positive or negative tone of sentiment in real time and/or influencing subsequent expectations. The home-buying process includes several steps, from searching for mortgage lenders and qualifying for a loan to initiating the search for a home. The search for a home itself can also take several months from first search to an accepted offer.
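The two-step construction described above (residualize each proxy on fundamentals, then take the first principal component of the residuals) can be sketched as follows. This is a minimal illustration on synthetic data, not the paper's implementation: the function names `residualize` and `multifactor_index`, the standardization of residuals, the sign convention, and all simulated inputs are assumptions made for the example.

```python
import numpy as np

def residualize(y, X):
    """Residuals from an OLS regression of y on X plus an intercept."""
    Z = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return y - Z @ coef

def multifactor_index(proxies, fundamentals):
    """First principal component of the proxies after removing fundamentals.

    proxies:      (T, n_proxies) array, one column per sentiment proxy
    fundamentals: (T, n_controls) array of economic control variables
    Returns the index (length T) and the variance share of the first component.
    """
    resid = np.column_stack(
        [residualize(proxies[:, j], fundamentals) for j in range(proxies.shape[1])]
    )
    # Standardize so that no proxy dominates because of its units (an assumption).
    z = (resid - resid.mean(axis=0)) / resid.std(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(z, rowvar=False))
    loadings = eigvecs[:, -1]          # eigh sorts eigenvalues in ascending order
    if loadings.sum() < 0:             # sign convention: positive average loading
        loadings = -loadings
    return z @ loadings, eigvals[-1] / eigvals.sum()

# Synthetic data: five proxies share one common factor plus fundamentals and noise.
rng = np.random.default_rng(0)
T = 200
fundamentals = rng.normal(size=(T, 2))
common = rng.normal(size=T)
proxies = np.column_stack(
    [common + fundamentals @ rng.normal(size=2) + 0.3 * rng.normal(size=T)
     for _ in range(5)]
)
index, share = multifactor_index(proxies, fundamentals)
```

In this simulated setting the extracted index tracks the common factor closely, which is the sense in which the first principal component isolates what the proxies share once fundamentals are partialled out.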
Thus, we would expect the multifactor index to lag the news sentiment index somewhat. The Online Appendix provides detailed correlations and figures on how the multifactor indices line up with the media sentiment indices by city. One limitation of the multifactor index is that the data for the five sentiment proxies require a proprietary transactions deeds records database, for which data are available for only 20 of the 34 cities in my sample (footnote 6). Nonetheless, for those cities for which data are available, the multifactor index lines up well with the media housing sentiment index. This correlation suggests that the media sentiment index captures sentiment not only from home buyers but also from agents across the housing market in general. Where data are available, the multifactor index also appears to be a potential alternative index, both to validate the media index and to provide an additional proxy for sentiment.

2. House Prices and Sentiment

2.1 Descriptive patterns

I obtain quarterly residential home prices across my sample of 34 cities from the Federal Housing Finance Agency (FHFA). Like the Case-Shiller home price index, the FHFA home price index estimates price changes for single-family homes with repeat sales to control for the changing quality of houses being sold through time. Its estimates are based on data on repeat mortgage transactions that have been purchased or securitized by Fannie Mae or Freddie Mac. The FHFA indices cover additional cities in my sample beyond the twenty major cities followed by the Standard & Poor’s/Case-Shiller index. The two indexes, however, are highly correlated (0.87). Figure 3 plots media sentiment and house prices for select cities in my sample. (For the full sample of 34 city plots, please see the Online Appendix.) Panel A provides plots for cities with high price appreciation from 2000 to 2006.
The striking boom and bust pattern in Figure 1 is driven by cities such as Los Angeles and Las Vegas, which experienced a similar lead-lag trend in housing sentiment and prices. Panel B shows, however, that cities with minimal price appreciation, such as Atlanta and Cleveland, appear to have more random or subdued patterns in media sentiment from 2000 to 2004. While these cities did not observe large run-ups in prices, they did experience large declines in prices that seem to be preceded by declines in media sentiment as well. As Ferreira and Gyourko (2012) also document, I observe wide variation in house price changes across cities in my sample during this period. Likewise, I find significant variation in the timing and magnitude of appreciation in the sentiment index across cities.

Figure 3 Housing sentiment and prices
This figure plots the housing sentiment and price indexes and lists the respective mean and standard deviation by city from 2000 to 2013 for select cities. Panel A provides plots for cities with high price appreciation from 2000 to 2006, and panel B shows plots for cities with relatively lower price appreciation. The solid blue line represents the media sentiment index, and the dashed red line plots the price index. Plots for the full sample of 34 cities are available in an Online Appendix.
The lead time between media sentiment and prices seems relatively large initially, particularly in comparison to the stock market, where sentiment predicts prices and their reversals over just several days (Tetlock 2007, Baker and Wurgler 2006). Taken together with the timing between my media sentiment index and the multifactor index, however, this lead-lag pattern seems consistent with the transaction process and frictions in the housing market. On average, my media sentiment index peaks in mid-2004, the multifactor index peaks in mid-2005, and house prices peak in mid-2006. Both the SOC and Case-Shiller surveys also find that expectations started declining in 2004, so the pattern in the media index appears to be consistent with survey expectations. As discussed in Section 1.4, all of the proxies in the multifactor index are transactional outcomes that would result from rising expectations, so we would expect sentiment to rise and fall first, with actions such as closing on a mortgage or investing in a second home following two or three quarters later. House price indices are in turn based on publicly recorded transactions registered with the county deeds records, which are not recorded until after a transaction has finally closed. Depending on details of home inspection and financing, the period from first offer to final closing on a home can often take at least 1 month. The shortest average mortgage loan closing period reported by Ellie Mae, a technology company that tracks mortgage applications, is 37 days (footnote 7). Considering these frictions and that house price indices are constructed as a 3-month moving average, it seems reasonable that it would take another two or three quarters for these shifts in housing transactions underlying the multifactor index to show up in changes in house price index levels.
2.2 Predicting prices with sentiment

The leading patterns across cities thus suggest that media housing sentiment has predictive power for prices. I explore the effect of my media measure of expectations on house prices with the following linear framework:

$$\triangle p_{i,t+1}=\alpha+\lambda L^{k}\triangle p_{it}+\beta L^{k}\triangle s_{it}+\gamma\triangle x_{it}+\epsilon_{i,t+1}\qquad(2)$$

where $$i$$ denotes each city and $$t$$ indicates a quarterly period. Lowercase letters denote logs, such that $$p_{t}=\log P_{t}$$, and $$\triangle$$ denotes the first difference, such that $$\triangle p_{t+1}=\log P_{t+1}-\log P_{t}$$. I include the same number of lags of past price changes as sentiment lags in all specifications, as denoted by $$L^{k}\triangle p_{it}$$. Predictability in house price changes has been well documented across a number of studies. Among the most well-known, Case and Shiller (1989) find significant positive serial correlation and predictability with past price growth in four markets. FHFA house price changes across my sample of 34 cities exhibit positive serial correlation with an average AR(1) coefficient equal to approximately 0.42. Vector $$\triangle x_{t}$$ includes a set of economic variables such as rents, population, income, employment, and interest rates that have been shown to predict residential house price growth over time. The “fundamental value” of an asset typically refers to the present discounted value of its future cash flows, and models of housing assume that housing pays dividends in the form of rental services. I obtain city-level measures of rents from REIS.com, which provides average asking rents on rental units that share characteristics with single-family homes. A number of housing studies also highlight the importance of labor market variables for housing demand (Roback 1982, Rosen 1979). I obtain local employment levels and unemployment rates by city from the Bureau of Labor Statistics (BLS).
The Rosen/Roback model of spatial equilibrium also suggests income as an important demand shifter. I include city measures of personal income per capita and population from the Bureau of Economic Analysis (BEA). Studies argue that low interest rates should lead to increased housing demand and higher prices (Himmelberg, Mayer, and Sinai 2005). I include the national 30-year conventional mortgage rate relevant to most home buyers, but also compute real interest rates following Himmelberg, Mayer, and Sinai (2005) by subtracting the Livingston Survey 10-year expected inflation rate from the 10-year Treasury bond rate for robustness. The Online Appendix provides the summary statistics for all variables. This set of economic “fundamentals” does an exceptional job explaining the changes in house prices prior to 2000, but has difficulty explaining the variation in the most recent cycle. The vector $$\triangle x_{t}$$ predicts nearly 70% of variation in composite house prices prior to 2000. Using the same set of economic variables to explain house price growth after 2000, however, explains less than 10% of the variation. Local economic variables explain more of the variation in prices city to city post-2000 than in the composite, but are able to explain 1.55 times more variation prior to 2000. These traditional housing factors are able to explain more variation in prices in cities that had stable house price appreciation, but account for minimal movement in cities with large swings in prices post-2000. Thus, Equation (2) considers whether a media proxy for expectations serves as a significant predictor for house prices during this period. I first normalize my index to be positive with the same adjustment as the SOC survey measure (i.e., $$pos-neg+100$$). I then orthogonalize my measure of sentiment changes, $$\triangle s_{t}$$, from the observed vector of fundamentals, $$\triangle x_{t}$$. 
Thus, $$\triangle s_{t}$$ represents log differences in my measure of housing media sentiment, orthogonalized to observed fundamentals, and $$L^{k}$$ is a lag operator that indicates $$k$$ lags, such that $$L^{k}\triangle s_{it}=\ln S_{i,t-k}-\ln S_{i,t-k-1}$$. I impose a finite distributed lag structure with four quarterly periods, such that $$k=4$$ (footnote 8). The parameter $$\beta$$ then captures the total accumulated predictive effect of sentiment on house prices, summed across the individual lags of $$L^{k}\triangle s_{t}$$. Equation (2) tests the alternative hypothesis that $$\beta\neq0$$ against the null that $$\beta=0$$. If media sentiment simply reflects price movements or economic fundamentals already incorporated into prices, then $$\beta$$ should not be significantly different from zero. Table 3 presents the results from Equation (2). The first row reports the total accumulated effect of housing sentiment, $$\beta$$, on quarterly price growth in period $$t$$. The subsequent rows break down the lagged effect of sentiment by quarter. Estimates show that a 1% appreciation in four quarters of accumulated lagged sentiment is positively associated with a future quarterly price appreciation of approximately 0.93 percentage points. These coefficient estimates represent the predicted effect of housing media sentiment above and beyond both historical housing economic variables and four quarters of past price changes. To put magnitudes into context, a one-standard-deviation increase in a one-year accumulation of housing sentiment in Las Vegas predicts approximately 12% future quarterly price growth. The largest quarterly price growth in my sample occurred in Las Vegas in Q2 of 2004, when prices appreciated almost 12.5%.
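To make the "sum of lagged sentiment" row concrete, the sketch below estimates a four-lag distributed lag regression on simulated data and reports the summed coefficient with a $$t$$-statistic for the sum. It is a simplified illustration under stated assumptions: a single synthetic series rather than a city panel, no lagged price changes or fundamentals, and classical homoskedastic standard errors rather than the Newey-West, Driscoll-Kraay, or double-clustered errors used in Table 3. The helper names `lag_matrix` and `summed_lag_effect` are hypothetical, and the simulated lag coefficients are chosen only to resemble the rounded magnitudes in column 1.

```python
import numpy as np

def lag_matrix(x, k):
    """Columns hold x lagged 1..k periods; row j corresponds to time t = k + j."""
    n = len(x)
    return np.column_stack([x[k - j : n - j] for j in range(1, k + 1)])

def summed_lag_effect(dp, ds, k=4):
    """OLS of dp[t] on an intercept and ds[t-1..t-k]; returns lag coefficients,
    their sum, and the t-statistic of the sum (classical standard errors)."""
    y = dp[k:]
    X = np.column_stack([np.ones(len(y)), lag_matrix(ds, k)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    cov = (resid @ resid / dof) * np.linalg.inv(X.T @ X)
    w = np.r_[0.0, np.ones(k)]            # picks out the k lag coefficients
    total = w @ beta
    se_total = np.sqrt(w @ cov @ w)       # uses covariances between lag estimates
    return beta[1:], total, total / se_total

# Synthetic example: sentiment changes feed into price growth over four quarters.
rng = np.random.default_rng(0)
T = 400
ds = rng.normal(0.0, 1.0, T)
true_b = np.array([0.30, 0.24, 0.23, 0.16])   # illustrative values, sum = 0.93
dp = np.zeros(T)
dp[4:] = lag_matrix(ds, 4) @ true_b + rng.normal(0.0, 0.1, T - 4)

lags, total, t_stat = summed_lag_effect(dp, ds)
```

The standard error of the sum depends on the covariances among the individual lag estimates, which is why the first row of the table carries its own $$t$$-statistic rather than a combination of the per-quarter ones.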
Table 3 Predicting house price growth with housing sentiment

Dep var: House price growth, $$t$$=quarterly

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Sum of lagged sentiment | 0.933 | 0.933 | 0.933 | 0.670 |
| | (7.003) | (3.064) | (3.643) | (3.319) |
| Qtr 1 $$(\triangle s_{t-1})$$ | 0.301 | 0.301 | 0.301 | 0.211 |
| | (6.905) | (3.503) | (3.792) | (3.722) |
| Qtr 2 $$(\triangle s_{t-2})$$ | 0.239 | 0.239 | 0.239 | 0.170 |
| | (5.189) | (2.823) | (2.947) | (2.359) |
| Qtr 3 $$(\triangle s_{t-3})$$ | 0.232 | 0.232 | 0.232 | 0.158 |
| | (4.744) | (2.284) | (2.833) | (2.227) |
| Qtr 4 $$(\triangle s_{t-4})$$ | 0.162 | 0.162 | 0.162 | 0.129 |
| | (4.120) | (2.703) | (2.934) | (2.720) |
| Lagged price changes | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Lagged fundamentals | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Lagged national sentiment | . | . | . | $$\checkmark$$ |
| SE: Newey-West | $$\checkmark$$ | . | . | . |
| SE: Driscoll-Kraay | . | $$\checkmark$$ | . | . |
| SE: double clustered by $$(i,t)$$ | . | . | $$\checkmark$$ | $$\checkmark$$ |
| Observations | 1,676 | 1,676 | 1,676 | 1,676 |
| Adjusted $$R^{2}$$ | 0.38 | 0.38 | 0.38 | 0.44 |

Coefficient estimates from predictive regressions of sentiment on house prices specified in Equation (2). The first row reports the accumulated impact of a lagged year of sentiment growth on future quarterly price growth. The subsequent rows break down this summed impact by quarterly lag. All columns control for four quarters of past price changes and a vector of housing fundamentals, $$\triangle x_{t-1}$$, described in the text. The corresponding $$t$$-statistics of each estimate are reported in parentheses and are based on the respective calculated standard errors. (Note that the first row reports $$t$$-statistics based on standard errors adjusted appropriately for the summed estimates of sentiment lags.) Column 4 controls for an additional four quarterly lags of national housing sentiment measured by the Michigan Survey of Consumers. Full regression results with all point estimates can be found in an Online Appendix.
Columns 1 through 3 of Table 3 compare $$t$$-statistics calculated under three different standard error procedures to test the stability of coefficient significance. In Column 1, I assume the error term $$\epsilon_{t+1}$$ is heteroscedastic across time and serially correlated within city, and calculate panel Newey and West (1987) standard errors that are robust to heteroscedasticity and autocorrelation up to twelve lags. This column assumes errors are correlated within city since studies have documented little mobility of homeowners across states. Nonetheless, the presence of spatial correlation across my measures could severely understate calculated standard errors (Foote 2007). To address potential cross-sectional spatial dependence, I calculate Driscoll and Kraay (1998) standard errors in Column 2 for robustness. I find this does increase my standard errors, as indicated by the lower $$t$$-statistics reported in parentheses, suggesting some dependence exists across cities. Nonetheless, my estimates remain statistically significant. In Column 3, I allow for further flexibility in the structure of this dependence and cluster my standard errors by both time and city. Doing so results in slightly higher $$t$$-statistics than in Column 2, and estimates remain statistically significant. The first three columns of Table 3 show that past local media sentiment has a positive and significant association with future quarterly house price growth above and beyond past local returns and local economic fundamentals. While house price cycles have been predominantly local in the past, the most recent cycle was marked by an unprecedented number of geographically separate cities that appeared to experience boom and bust trends at similar times, suggesting a potential national component across markets. Thus, in addition to local media sentiment, Column 4 includes the SOC Housing Survey as a proxy for nationwide housing sentiment.
The SOC Housing Survey does have significant predictive power for local returns as well, though at a smaller magnitude than the local index (footnote 9). Including national sentiment lowers the estimated coefficient on the local media sentiment index from 0.93 to 0.67, suggesting that a national systematic component accounts for approximately 30% of the local sentiment index. Thus, while there is a strong local relationship between returns and sentiment, there is a notable systematic component across local housing sentiment as well. These results are consistent with the puzzle of simultaneous common yet heterogeneous movement observed across local housing markets during the last housing cycle. While an unprecedented number of local housing markets appeared to boom and bust at similar times, there were still many cities whose markets did not boom at all, and yet looked similar on fundamentals to those that did. This puzzle points to why it is difficult for a national factor, such as the mortgage rate, to explain the breadth of price differences across local housing markets. Glaeser, Gottlieb, and Gyourko (2010), for example, found that mortgage rates and available credit can explain up to one-fifth of price appreciation across cities. Nonetheless, these results suggest that there is evidence of a systematic factor that plays a significant role across sentiment and prices. The following section further explores potential common determinants in sentiment across cities.

2.3 Determinants of sentiment

The results thus far show that expectations, as proxied by the media sentiment, are positively associated with future house prices. At the same time, expectations may also be influenced by past price changes.
In their first survey of home buyer expectations, Case and Shiller (1988) concluded “people seem to form their expectations from past price movements rather than knowledge of fundamentals.” In their updated surveys, Case, Shiller, and Thompson (2012) find that home buyers’ expectations appeared extremely high, projecting an appreciation of more than 10% per year for the next 10 years. While these expectations at first seem wildly unrealistic, Case, Shiller, and Thompson (2012) note that the Case-Shiller composite-10 index had indeed appreciated nearly 11% per year over the ten years from 1996 to 2006. Greenwood and Shleifer (2014) examine six different surveys of investor expectations on the stock market, and find that investor expectations do appear to extrapolate from past stock market returns. I explore the nature of my measure of housing expectations with a linear specification analogous to the one in their study:

$$S_{it}=\alpha+\lambda R_{i,t-k}+\delta P_{it}+\gamma\log X_{it}+city_{i}+u_{it}\qquad(3)$$

where $$S_{it}$$ denotes the level of housing expectations in city $$i$$ at quarter $$t$$, and $$R_{i,t-k}$$ represents the cumulative return in city $$i$$’s local housing market from period $$t-k$$ to $$t$$. I control for both price-to-rent and price-to-income ratios, denoted by $$P_{it}$$; $$X_{it}$$ is the same vector of economic variables from Equation (2), and $$city_{i}$$ denotes city fixed effects. I report results based on standard errors double clustered by time and city. Table 4 reports the estimates from Equation (3) for four different time horizons. Columns 1 through 4 regress the housing media sentiment index on 6-month, 1-year, 3-year, and 5-year cumulative price growth, respectively. The results show that past price appreciation predicts higher media sentiment, consistent with home buyer expectations in the Case and Shiller (1988) and Case, Shiller, and Thompson (2012) survey responses.
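As a small illustration of the regressor in Equation (3), the sketch below computes cumulative returns over a trailing window of $$k$$ quarters from a price index. The definition used here, $$P_{t}/P_{t-k}-1$$, is an assumption made for the example (the text does not spell out the exact formula), and the function name `cumulative_returns` and the price series are hypothetical.

```python
def cumulative_returns(prices, k):
    """Trailing k-period cumulative return, P[t] / P[t-k] - 1.

    Entries with fewer than k prior observations are returned as None.
    """
    return [
        None if t < k else prices[t] / prices[t - k] - 1.0
        for t in range(len(prices))
    ]

# Made-up quarterly price index for a single city.
index_levels = [100.0, 102.0, 105.0, 110.0, 121.0]

# A 6-month horizon corresponds to k = 2 quarters; the 1-, 3-, and 5-year
# horizons in Table 4 would correspond to k = 4, 12, and 20.
two_quarter = cumulative_returns(index_levels, 2)
```

Under this definition the first usable observation for a horizon of $$k$$ quarters is quarter $$k$$, so each observation at the longer horizons draws on a longer price history.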
Column 1 shows that the coefficient on a 6-month lagged cumulative return is equal to approximately 7.3. This implies that if returns to housing in the last 6 months increase by one standard deviation (approximately 5.5% across my sample period), then the housing media sentiment index will increase by approximately 0.5 units. Given that the media sentiment index ranges from approximately $$-3.5$$ to 10.6, the impact of 0.5 units is sizeable.

Table 4 Predicting housing sentiment with past returns

Dep var: Media sentiment $$S_{it}$$, $$t$$=quarterly

| | 6 months (1) | 1 year (2) | 3 years (3) | 5 years (4) |
|---|---|---|---|---|
| Cumulative price appreciation | 7.310 | 5.362 | 1.404 | -0.331 |
| | (3.85) | (4.25) | (1.75) | (-0.53) |
| Price/rent | 10.956 | 9.204 | 8.057 | 14.633 |
| | (2.96) | (2.55) | (1.63) | (2.15) |
| 30-year mortgage rate | -1.123 | -1.029 | -1.309 | -1.472 |
| | (-5.25) | (-5.06) | (-6.17) | (-8.07) |
| City FEs | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Lagged fundamentals | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| SE: double clustered by $$(i,t)$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Observations | 1,852 | 1,852 | 1,852 | 1,852 |
| Adjusted $$R^{2}$$ | 0.312 | 0.319 | 0.292 | 0.285 |

Coefficient estimates from Equation (3), regressing sentiment levels on past cumulative returns, price-to-rent, price-to-income, and all fundamentals. The table reports only the price-to-rent and mortgage rate controls; all other fundamentals are insignificant and of negligible magnitude. All inference is based on standard errors double clustered by quarter and city, and corresponding $$t$$-statistics are reported in parentheses. Estimates show that when past returns increase, housing sentiment also rises. This effect is robust up to a window of 3 years, but the magnitude of the effect appears to decline with a longer horizon of 5 years.

The most recent returns appear to have the greatest impact, though past returns have a positive impact in the first three columns. The magnitude of the estimated effect of lagged returns declines with the horizon: the coefficient falls from 5.36 on one-year cumulative returns to 1.40 on three-year returns. Extending the window of returns to 5 years in Equation (3) results in estimates that are negative and not statistically significant. The effect of lagged returns on media sentiment holds only for the most recent returns. These findings suggest that housing expectations, as proxied by the media, possess a backward-looking nature consistent with theories of extrapolative expectations in behavioral finance (Barberis, Shleifer, and Vishny 1998, Campbell and Kyle 1993, Delong et al. 1990b). These results are also consistent with the structure of investor expectations that Greenwood and Shleifer (2014) find in the stock market. Price-to-rent ratios are positively correlated with sentiment levels, but lose significance and magnitude beyond a window of 1 year. Further including price-to-income ratios has no significant effect. Note that all columns also include city fixed effects. Without city fixed effects, the predictive effect of past returns on sentiment is slightly higher, suggesting that the effect of returns on sentiment comes not only from within-city variation but also from differences in sentiment levels from city to city.
I include all housing fundamentals from Equation (2), but only the mortgage rate has an effect of notable significance. Table 4 reports the coefficient estimates on the 30-year mortgage rate and shows that the mortgage rate is negatively associated with media sentiment: as the mortgage rate declines, housing media sentiment increases. This is consistent with responses in the SOC that answered optimistically about housing because mortgage rates were low. As Piazzesi and Schneider (2009) point out, interest rates have historically been a major driver of housing perception, and these estimates suggest that mortgage rates continued to have an impact on public perception during this sample period as well. Nominal mortgage interest rates reached their lowest point in mid-2004, coinciding with the peak of many of the local sentiment indices. Real interest rates, however, actually hit their lowest point later in 2005. This common timing of the sentiment index with nominal, rather than real, interest rates is consistent with the behavioral bias of money illusion, that is, the tendency of individuals to think only in nominal terms. The results in Table 4 suggest that (nominal) mortgage interest rates are an important common determinant of housing media sentiment across cities, consistent with this money illusion bias.

3. Interpretation

The results in Section 2 show that housing sentiment, as captured by the media, is positively associated with future house price growth. This predictive effect is robust to known fundamental controls that have explained house prices well historically. The structure of the housing sentiment index appears to be extrapolative in nature, and peaks more than two-and-a-half years ahead of house prices on average. One interpretation of these results is that they capture the effects of sentiment, in which investor beliefs were over-optimistic and drove house prices away from fundamentals.
The media sentiment index certainly does mirror patterns in the Case, Shiller, and Thompson (2012) survey, where home buyer expectations look unjustifiably high and similarly peak in 2004. At the same time, however, these results could also be consistent with a story of unobserved fundamentals that were instead driving price growth. Housing markets are inherently local, and local media in particular could convey information on local fundamentals that are otherwise difficult to observe or collect. Another possibility, then, is that the media sentiment index represents a valuable source of unobserved information about the local housing market. Reviewing the patterns in Figure 2, particularly in markets with a defined boom and bust of prices, one could argue that the media proxy even looks like a perfect forecast, perhaps an indication that expectations reflected justified local information. The goal in this section is to provide a set of tests to explore these two interpretations empirically. The advantage of measuring housing sentiment with the news media is that we can exploit the richness of the data both across text and across city indices. We can then use this analysis to test for results that we would expect under an interpretation of sentiment versus a story of information on unobserved fundamentals.

3.1 Testing the news content over fundamentals

I first address the interpretation of the media sentiment index as a source of unmeasured fundamentals by examining the informational content of the articles firsthand. By keeping track of all of the text across articles, I can directly examine whether my media sentiment index is driven by articles discussing particular housing variables. Following an analogous strategy from Tetlock, Saar-Tsechansky, and Macskassy (2008), I flag any news article that mentions stem words such as “rents,” “population,” or “taxes” that may indicate discussions of local housing market information.
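The flagging step, combined with a tone score defined as the share of positive minus negative words, can be sketched as follows. This is a minimal illustration under stated assumptions: the word lists, the keyword stems, and the choice to pool words across flagged articles are hypothetical and do not reproduce the paper's dictionaries or exact procedure.

```python
import re

# Hypothetical word lists and stems, for illustration only.
POSITIVE = {"boom", "strong", "rising", "gain"}
NEGATIVE = {"slump", "weak", "falling", "loss"}
TOPIC_STEMS = {
    "rents": ("rent",),
    "demographics": ("population", "income"),
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def mentions(tokens, stems):
    """True if any token starts with one of the topic stems."""
    return any(tok.startswith(stems) for tok in tokens)


def tone(tokens):
    """Share of positive minus negative words among all words."""
    if not tokens:
        return 0.0
    pos = sum(tok in POSITIVE for tok in tokens)
    neg = sum(tok in NEGATIVE for tok in tokens)
    return (pos - neg) / len(tokens)


def media_fundamental(articles, topic):
    """Tone pooled over articles that mention the topic; None if none do."""
    tokenized = [tokenize(a) for a in articles]
    flagged = [toks for toks in tokenized if mentions(toks, TOPIC_STEMS[topic])]
    if not flagged:
        return None
    return tone([tok for toks in flagged for tok in toks])


articles = ["Rents are rising and demand is strong", "Home sales slump"]
media_rents = media_fundamental(articles, "rents")
```

Only the first article is flagged for "rents" here, so the second article's negative word does not enter that index.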
I then quantify the fraction of positive and negative words across these articles to create a set of “media fundamentals” analogous to my overall housing media sentiment index. For example, in Table 5, “media rents” is an index of the positive and negative words across all local articles that discuss rents, “media user costs” refers to words across any articles that discuss factors that enter into the user cost of housing, such as property taxes and maintenance costs, and “media demographics” covers any articles that discuss local population and income. Through this strategy, the news media index also presents a potential opportunity to quantify particular information in markets where fundamentals are difficult to observe. I then additionally control for these media fundamentals in Equation (2) to see if any of these measures reduce my estimated effect of expectations. If the discussion of fundamentals in these articles drives the patterns in my main housing media sentiment index, then controlling for words in these articles should drive down the significance and magnitude of the results in Table 3. I control for two quarters of lags of each media fundamental as well as all observed fundamental controls from Equation (2). Table 5 shows that the estimated effects of media sentiment on price growth remain robust to controlling for news content over fundamentals. Columns 1 through 7 add each media control sequentially to test the stability of the coefficient estimate for the overall sentiment index, $$\beta$$. Coefficient estimates of $$\beta$$ remain remarkably robust with the inclusion of each additional control and decline neither in significance nor magnitude. As argued by a number of previous studies, the stability of estimates to the sequential addition of controls suggests bias from unobserved factors is less likely (Altonji et al. 2005, Angrist and Krueger 1999).

Table 5 Is the housing sentiment index driven by news stories on fundamentals?
Dep var: Housing price growth, $$t$$=quarterly

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) |
|---|---|---|---|---|---|---|---|
| Sum of lagged sentiment | 0.929 | 0.939 | 0.940 | 0.943 | 0.924 | 0.891 | 0.836 |
| | (3.636) | (3.735) | (3.715) | (3.711) | (3.440) | (3.283) | (3.061) |
| Media rents | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Media labor market conditions | | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Media housing supply | | | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Media user costs | | | | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Media demographics | | | | | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Media local GDP & inflation | | | | | | $$\checkmark$$ | $$\checkmark$$ |
| Media credit conditions | | | | | | | $$\checkmark$$ |
| Lagged prices | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Lagged fundamentals | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| SE: double clustered by $$(i,t)$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Observations | 1,676 | 1,676 | 1,676 | 1,676 | 1,676 | 1,676 | 1,676 |
| Adjusted $$R^{2}$$ | 0.379 | 0.380 | 0.380 | 0.381 | 0.383 | 0.385 | 0.387 |

Coefficient estimates for sentiment from Equation (2), with additional controls for news content over fundamentals. Each media fundamental listed identifies any news article that mentions a particular fundamental in its text. The variable “media rents”, for example, is the share of positive minus negative words in any real estate articles that mention any word related to “rents” in its text. Standard errors are double-clustered by quarter and city, and corresponding $$t$$-statistics are reported in parentheses. Estimates of lagged media represent the impact of a 1 $$\%$$ increase in the accumulated past year growth of sentiment on the quarterly growth in house prices. Estimates remain robust to the inclusion of “media fundamentals,” suggesting that the estimated effect is not driven by a particular set of stories on fundamentals.

The estimated effect of housing media sentiment remains significant, positive, and large in magnitude after accounting for an extensive set of fundamental articles. The only media fundamental to reduce the magnitude slightly is the fraction of positive and negative words across articles discussing credit conditions. While not a typical housing fundamental historically, an additional new factor largely debated during the housing crisis was the availability of easy credit.
Mian and Sufi (2011) show that the extraordinary rise in house prices from 2000 to 2005 was also accompanied by an unprecedented expansion of mortgage credit, particularly in the subprime market (Mian and Sufi 2009, Glaeser, Gottlieb, and Gyourko 2010). Easing lending standards and rising approval rates opened home-buying to a new set of consumers, which potentially allowed a new group of homebuyers to shift aggregate demand and drive up house price growth (Keys, Seru, and Vig 2012, Mian and Sufi 2011). Including additional controls for subprime lending and easy credit, however, does not diminish the predictive effect of the media index on prices.10

3.2 Cross-sectional effects of housing media sentiment

The preceding sections take advantage of the textual nature of the data, which allows for a direct analysis of the content across articles. Measuring sentiment with news media further provides an avenue to create indices across several markets in real time, creating an opportunity to explore sentiment effects cross-sectionally. Given the city-wide variation in house price changes during this period, this is particularly useful when exploring the relationship between sentiment and prices in the housing market. With access to city-specific indices, we can explore whether prices and sentiment vary systematically according to any city-level traits. The cross-sectional nature of the data thus allows us to test whether cross-sectional differences exist based on characteristics that we would expect to be consistent with a theory of sentiment.

3.2.1 Minority home buyers

Baker and Wurgler (2006) highlight two channels through which theory predicts sentiment has cross-sectional effects on prices: (1) where demand is less informed and (2) where arbitrage constraints are particularly binding. Since arbitrage constraints are completely binding in the housing market, this suggests potential cross-sectional effects lie in differences in informed demand for housing among buyers.
For example, Shiller (2008) raises concerns about the disproportionate lack of access to adequate financial advice among minority groups, which may lead to investment decisions based on minimal or biased information. Indeed, in a comprehensive survey of financial literacy, Lusardi and Mitchell (2007) find that financial knowledge and planning are at their lowest levels among Hispanic and Black respondents. The Home Mortgage Disclosure Act (HMDA) requires lending institutions to file reports on all mortgage applications, and thus provides an opportunity to test cross-sectional effects based on the demographic profile of the pool of potential home buyers. Following Ferreira and Gyourko (2012), I define a “% minority” variable based on the fraction of African-American and Hispanic loan applicants coded in the HMDA data set. I calculate the average fraction of minority loan applicants across the sample period for each city, and then divide the 34 cities in my sample into two equal groups based on whether the city contains a low or high fraction of potential minority homebuyers. I then estimate the following equation to test whether sentiment effects differ across groups:

\begin{align}
\triangle p_{it+1} &= \alpha+\lambda L^{k}\triangle p_{it}+\beta L^{k}\Delta s_{it}+High+\beta_{High}L^{k}\Delta s_{it}\times High \nonumber\\
&\quad +\,\gamma\triangle x_{it}+\epsilon_{it+1} \tag{4}
\end{align}

Equation (4) examines the same predictive relationship between prices and media sentiment as in Equation (2), but now explores additional interactions between the fraction of potential minority homebuyers and lagged sentiment. $$High$$ is an indicator variable for a city with a high fraction of minority buyers, and is interacted with all included lags of media sentiment. Thus, $$\beta$$ now represents the baseline effect of sentiment for cities with the lowest fraction of minority homebuyers.
The parameter $$\beta_{High}$$ then captures the additional sentiment effect of being in the high group. If there are no cross-sectional differences across cities, the coefficient $$\beta_{High}$$ should not be significantly different from zero. If we presume the pool of buyers with low access to financial advice to be more likely subject to sentiment, then we would expect sentiment effects to be larger in cities with a larger fraction of minority buyers. In other words, a significant coefficient $$\beta_{High}>0$$ would indicate a cross-sectional effect consistent with a theory of sentiment. Column 1 of Table 6 reports the coefficient estimates $$\beta$$ in the first row and $$\beta_{High}$$ in the row directly below. The results reveal that sentiment effects do appear to be greater in cities with a higher fraction of minority loan applicants. Baseline estimates for the low group indicate that a 1% appreciation in accumulated lagged sentiment predicts a future quarterly price appreciation of approximately 0.6 percentage points. This is still significant in magnitude relative to the average quarterly house price appreciation (0.8). Estimates for $$\beta_{High}$$, however, suggest that the same increase in sentiment would lead to an additional 1.3 percentage point increase in future price growth. Thus, sentiment in cities with a larger fraction of minority mortgage applicants has an impact more than 2x greater than in those with fewer potential minority home buyers. In their analysis, Ferreira and Gyourko (2012) find that the share of minority home buyers is not able to explain the timing or magnitude of the house price booms across cities. These results suggest that while the fraction of minority purchasers may not be able to account for the changes in house prices alone, it may still have served as a factor in combination with changing sentiment during this period.
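As a worked reading of Column 1 of Table 6, and assuming the usual interpretation of an interaction term in Equation (4) (the total effect in the $$High$$ group is the sum of the baseline and interaction coefficients), the implied magnitudes are:

\begin{align*}
\beta+\beta_{High} &= 0.582+1.315 = 1.897,\\
\frac{\beta_{High}}{\beta} &= \frac{1.315}{0.582}\approx 2.26,\qquad
\frac{\beta+\beta_{High}}{\beta} = \frac{1.897}{0.582}\approx 3.26.
\end{align*}

Under this reading, the additional effect in the $$High$$ group is a little over twice the baseline effect, and the total effect is roughly three times the baseline.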
Table 6 Cross-sectional effects of housing sentiment

| | (1) | (2) | (3) | (4) Low price homes | (5) High price homes |
|---|---|---|---|---|---|
| Sum of lagged media sentiment | 0.582 | 0.578 | 0.850 | 1.394 | 0.806 |
| | (2.280) | (1.928) | (1.297) | (4.903) | (3.919) |
| $$\quad\times High$$ minority buyers | 1.315 | | | | |
| | (3.748) | | | | |
| $$\quad\times High$$ 2nd home buyers | | 1.848 | | | |
| | | (4.183) | | | |
| $$\quad\times High$$ subprime loans | | | 1.414 | | |
| | | | (2.176) | | |
| Lagged prices ($$\triangle p_{t-1...t-4}$$) | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Lagged fundamentals $$(\triangle x_{it-1})$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| SE: double clustered by $$(i,t)$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Observations | 1,450 | 1,450 | 754 | 824 | 824 |
| Adjusted $$R^{2}$$ | 0.392 | 0.396 | 0.424 | 0.633 | 0.633 |

Columns 1-3 report the estimates for $$\beta$$ and $$\beta_{High}$$ in Equation (4), exploring cross-sectional effects of sentiment on house price changes. Each coefficient estimate measures the accumulated impact of a lagged year of sentiment growth on future quarterly price growth. The baseline estimate for $$\beta$$ represents the effect for the $$Low$$ group described in the text, and estimates of $$\beta_{High}$$ represent the additional effect of sentiment on prices for the corresponding $$High$$ group. Columns 4 and 5 estimate the effect of sentiment from Equation (2) on low versus high-tier housing markets as measured by the Case-Shiller house price index. Note that data coverage for estimations in Columns 3-5 tracks fewer cities, and thus the number of observations in these columns is smaller. All columns control for past price changes and the same vector of housing fundamentals as in all previous tables. All inference is based on double-clustered standard errors by quarter and city, and the corresponding $$t$$-statistics are reported in parentheses. Results show that estimated sentiment effects are greater across groups with investors that are likely more subject to housing sentiment.
3.2.2 Speculators

Piazzesi and Schneider (2009) analyze housing answers from the Michigan Survey of Consumers and identify a set of homebuyers as “momentum” traders, or those that invested in housing because of an increased optimistic belief in rising house prices. Indeed, Delong et al. (1990a) show that traders may rationally respond to growing sentiment in the market and buy in anticipation of increasing prices. Chinco and Mayer (2016) document the presence of second home buyers across local housing markets from 2000 to 2007, and find that nonlocal investors not only behave as speculators but also do not appear particularly well informed. Bayer, Geissler, and Roberts (2011) find that speculative activity, particularly among naive investors, increased during the housing boom. Regardless of whether these investors were rationally chasing sentiment or subject to it themselves, these studies present an additional opportunity to test cross-sectional effects of sentiment across markets. Given a measure of speculators, evidence suggests that we should observe a larger impact of sentiment in markets where there is greater speculation. Utilizing the transaction-level records in the DataQuick data set, I define the number of speculators in a city as those who purchased one or more homes that are not owner-occupied.11 I again divide the cities in my sample into two equal groups, but now based on the presence of speculators across markets. $$Low$$ thus refers to markets with a low number of speculators and $$High$$ is an indicator variable for cities with an above-median number of speculators. I then run the same regression in Equation (4), but test for cross-sectional differences based on the number of speculators across markets. Consequently, $$\beta$$ now represents the baseline effect of sentiment for cities with few or no speculators, and $$\beta_{High}$$ now represents the additional sentiment effect of having a high presence of speculators.
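The grouping step can be sketched as a median split followed by construction of the interaction regressor from Equation (4). The city labels, counts, and function names below are hypothetical, and the sketch covers only the construction of $$High$$ and $$\Delta s_{it}\times High$$, not the estimation itself.

```python
from statistics import median


def high_indicator(city_values):
    """Return 1 for cities above the cross-city median, 0 otherwise."""
    cutoff = median(city_values.values())
    return {city: int(value > cutoff) for city, value in city_values.items()}


def interaction_terms(sentiment_growth, high):
    """Pair each (city, lagged sentiment growth) with its product with High."""
    return [(city, ds, ds * high[city]) for city, ds in sentiment_growth]


# Hypothetical speculator counts per city (not data from the paper).
speculators = {"A": 120, "B": 450, "C": 80, "D": 900}
high = high_indicator(speculators)

# Hypothetical lagged sentiment growth observations.
rows = interaction_terms([("A", 0.5), ("B", 0.5), ("D", -1.0)], high)
```

For cities in the $$Low$$ group the interaction column is zero, so $$\beta$$ alone governs their sentiment effect, while $$High$$ cities load on both $$\beta$$ and $$\beta_{High}$$.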
As with the test across minority homebuyers, if there exist no cross-sectional differences across cities, $$\beta_{High}$$ should not be significantly different from zero. If $$\beta_{High}$$ is significantly greater than zero, then this would indicate that the impact of sentiment is greater in cities with more speculative investors, consistent with a theory of sentiment. Column 2 of Table 6 reports the results of estimating Equation (4) based on the cross-sectional differences in speculators. The results show that sentiment still has a positive and significant predictive effect on prices in cities with fewer speculators. Coefficient estimates for the $$Low$$ speculators group again show that a 1% appreciation in accumulated lagged sentiment predicts a future quarterly price appreciation of approximately 0.6 percentage points, similar to that of the low minority group. Coefficient estimates for $$\beta_{High}$$, however, suggest that a 1% increase in sentiment would lead to an additional 1.8 percentage point increase in future price growth. The magnitude of this estimate reveals a greater difference between the low and high groups than the estimate for the high minority group reported in Column 1 (1.848 versus 1.315). Cities with a greater number of speculators observe a sentiment impact more than double that in those with fewer speculative investors. Thus, as with the share of minorities, estimates again reveal results consistent with a theory of sentiment.

3.2.3 Subprime loans

While the trend in subprime lending does rise and fall with prices in some cities, I find that its pattern does not drive out the impact of sentiment on house prices. This result is consistent with studies such as Glaeser, Gottlieb, and Gyourko (2010) that find that, while leverage expanded significantly, credit variables are unable to explain the substantial rise in house prices.
Ferreira and Gyourko (2012) suggest that while their results also cannot explain the timing of the housing boom with patterns of subprime lending, subprime lending may still have played an important role in the later perpetuation of the housing boom. One possibility is that subprime lending served as a factor in combination with changing sentiment during this period. Mian and Sufi (2009), for example, find evidence that the increasing supply of credit from lenders led to the expansion of leverage taken out by subprime borrowers. If subprime lenders did indeed target the profile of certain borrowers, they likely targeted those buyers that were most subject to sentiment. We would simultaneously expect those most uninformed and vulnerable to sentiment to be more willing to take out risky, subprime loans. If this is the case, then this suggests an additional cross-sectional test: we should observe a larger impact of sentiment on prices in markets where more subprime loans were taken out. I divide markets into equal $$Low$$ and $$High$$ groups, now based on the average number of subprime loans made each quarter. Following Ferreira and Gyourko (2012), I consider a loan subprime if issued by any of the top twenty subprime lenders ranked by the publication Inside Mortgage Finance. I again estimate Equation (4), but now test for cross-sectional differences based on subprime lending across markets. Similar to the two previous tests, this estimation tests the alternative hypothesis $$\beta_{High}>0$$, indicating that sentiment effects on prices are greater in cities with greater subprime lending. Results in support of the null hypothesis, $$\beta_{High}=0$$, would instead indicate that there is no discernible cross-sectional difference across markets based on the frequency of subprime lending. Column 3 of Table 6 reports the cross-sectional results based on the differences across markets in subprime loans.
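A one-sided test of $$\beta_{High}=0$$ against $$\beta_{High}>0$$ can be illustrated from a reported $$t$$-statistic. The sketch below assumes a standard normal reference distribution, which is only a large-sample approximation and may be rough with clustered standard errors and a limited number of cities. It is an illustration, not the paper's inference procedure, and the function name is hypothetical.

```python
import math


def one_sided_p_value(t_stat):
    """P(Z > t_stat) for standard normal Z: p-value for H1: beta_High > 0."""
    return 0.5 * math.erfc(t_stat / math.sqrt(2.0))


# t-statistic on the subprime interaction term, Table 6, Column 3.
p_subprime = one_sided_p_value(2.176)
```

A small value of `p_subprime` would count as evidence against the null of no cross-sectional difference, under the normal approximation stated above.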
Baseline estimates for sentiment in the $$Low$$ group are still positive and statistically significant, and just slightly lower in magnitude than the baseline estimates of sentiment in Table 2. Estimates for $$\beta_{High}$$ in Column 3, however, indicate that the effect of sentiment is even greater in magnitude across markets with more subprime lending. A 1% increase in sentiment has an additional 1.4 percentage point impact on price appreciation in cities with high subprime lending. As in Column 2, these estimates show a distinct difference between the $$Low$$ and $$High$$ groups based on subprime loans. Consistent with a sentiment hypothesis, cities with greater subprime lending experience a greater effect of sentiment on house price changes.

3.2.4 Price of homes

The tests in this section thus far have taken advantage of the cross-sectional differences in housing demand across cities to test predicted differences in sentiment. Nonetheless, since there exist different types of buyers within markets as well, we can also exploit cross-sectional differences in informed demand within a city. If the above results are evidence of an effect of sentiment, then varying levels of informed agents within local markets should imply varying impacts of sentiment as well. One way to identify different types of agents within a city is through the separate markets they participate in. The most obvious and straightforward way to distinguish these markets is through price. Buyers with lower incomes would only qualify for lower-priced homes, while higher-income buyers would be more active in higher-priced homes. Minority households are often in the lower quartiles of the income distribution, and Lusardi and Mitchell (2007) find that lower-income respondents similarly report lower financial literacy levels.
Thus, as in the other tests in this section, a theory of sentiment would suggest that prices in markets dominated by buyers with less financial education should be more greatly affected by sentiment than those among a higher income profile.

FHFA house prices used in the estimations thus far only provide an index of the average change in all house price levels for a metropolitan area. The Case-Shiller home price indices, however, are also divided into three price tiers: low, medium, and high. The tiers are calculated to be comparable across metro areas, so that the low tier reflects the bottom third of sale prices while the high tier reflects sales in the top third of home prices. Thus, in Columns 4 and 5 of Table 6, I re-estimate the main specification in Equation (2) but replace overall prices with first low-tier and then high-tier prices across metropolitan areas, respectively. Note that since the Case-Shiller home price index only tracks house prices for 20 metropolitan areas, estimations in Columns 4 and 5 are limited to this sample.

Columns 4 and 5 show that estimated effects of sentiment are positive and significant in magnitude relative to the average quarterly price growth in both low- and high-priced homes. Column 4 reveals, however, that the total estimated effect of sentiment on low-tier homes is much higher than that on high-priced homes across cities. Estimates are not only statistically significant but very large in magnitude, reporting a nearly 1.4 percentage point response in quarterly price growth to past changes in sentiment. In contrast, Column 5 shows that estimates for high-priced homes are smaller, reporting results of nearly half the magnitude of those for lower-priced homes ($$\beta=0.81$$). Therefore, these estimates confirm that results are consistent with predictive effects of sentiment not only cross-sectionally across cities, but among market segments within cities as well.

4. Conclusion

While there has been much discussion of and interest in the role of “animal spirits” in the most recent housing crisis, empirical tests of this hypothesis have been limited by the lack of sentiment measures for the housing market. Any measure of expectations is naturally difficult to construct, and survey measures are expensive to implement and therefore limited in geographic scope and/or frequency. Housing markets are driven by local factors, however, and understanding why some markets experienced big price movements and others did not in the last housing cycle requires variables with cross-sectional variation. This paper contributes the first real-time measures of local housing sentiment across 34 major metropolitan markets by quantifying the tone of local housing news. Specifically, I capture the share of positive minus negative words across local housing newspaper articles.

I find that patterns in my measure of media housing sentiment appear to lead movements in house prices. In cities with big booms and busts in house prices, I find that media sentiment peaks in approximately 2004 while house prices peak in 2006. This leading pattern is also reflected in the Michigan Survey of Consumers (national) measure of housing confidence, which peaks slightly ahead of my composite-20 media housing sentiment index in 2004. I am also able to validate four of my local city sentiment measures against surveys of housing expectations from Case, Shiller, and Thompson (2012). Though only available annually, the Case, Shiller, and Thompson (2012) measures exhibit similar leading patterns across cities and correlate highly with my media sentiment indices. In some ways, this pattern seems to contradict the perception that buyers were positive up through 2006 before prices began to fall. Articles in my media sentiment index are still positive through 2006, however, but reach their positive peak in 2004.
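
To make the net-tone measure concrete, the following Python sketch computes a share of positive minus negative words for a set of articles in one period. It is an illustration under stated assumptions, not the paper's implementation: the three word lists are tiny hypothetical stand-ins (the actual lists are not reproduced here), and flipping the sign of a tone word that follows a negation is an assumed rule. The five-word look-back window follows footnote 4, and the two aggregation choices follow footnote 3.

```python
import re

# Hypothetical mini word lists, for illustration only.
POSITIVE = {"boom", "soar", "rise", "surge", "strong"}
NEGATIVE = {"bust", "fall", "slump", "drop", "weak"}
NEGATIONS = {"not", "no", "never"}
WINDOW = 5  # number of preceding words checked for a negation


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tone_counts(tokens):
    """Return (positive_hits, negative_hits, total_words) for one token list.
    Assumed rule: a tone word is counted with the opposite sign when a
    negation word appears among the WINDOW tokens just before it."""
    pos = neg = 0
    for i, tok in enumerate(tokens):
        if tok not in POSITIVE and tok not in NEGATIVE:
            continue
        negated = any(t in NEGATIONS for t in tokens[max(0, i - WINDOW):i])
        if (tok in POSITIVE) != negated:
            pos += 1
        else:
            neg += 1
    return pos, neg, len(tokens)


def pooled_net_tone(articles):
    """Sum hits and words over all articles, as if they formed one long
    document (negation windows here do not cross article boundaries)."""
    counts = [tone_counts(tokenize(a)) for a in articles]
    total = sum(c[2] for c in counts)
    if total == 0:
        return 0.0
    return (sum(c[0] for c in counts) - sum(c[1] for c in counts)) / total


def mean_article_net_tone(articles):
    """Compute the net share within each non-empty article, then average."""
    shares = []
    for article in articles:
        pos, neg, total = tone_counts(tokenize(article))
        if total > 0:
            shares.append((pos - neg) / total)
    return sum(shares) / len(shares) if shares else 0.0


toy_articles = [
    "Home prices rise as sales surge",
    "Prices fall and demand is not strong",
]
print(pooled_net_tone(toy_articles))
print(mean_article_net_tone(toy_articles))
```

In this toy input the pooled measure is zero (two positive and two negative hits over thirteen words), while the per-article average is slightly positive, which shows how the two aggregation choices can differ on very small samples.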
Similarly, in both the Michigan SOC and Case, Shiller, and Thompson (2012), respondents are still very positive up to 2006, but their expectations are at their highest in 2004.

I find that changes in my measure of media sentiment have significant predictive power for future house price growth. The media housing sentiment index explains a significant amount of variation in house price changes above and beyond a set of observed economic factors that have been shown to predict house prices historically. These traditional factors appear to explain more variation in cities with more stable house price appreciation, while media sentiment accounts for more of the variation in cities with large swings in house prices. This effect remains robust to the inclusion of additional controls for subprime lending and the availability of easy credit. The media sentiment index itself has a backward-looking structure consistent with the extrapolative expectations proposed in behavioral finance theories.

These results are consistent with two interpretations of the housing media sentiment index. The index could proxy for investor sentiment in the housing market, or it could instead proxy for difficult-to-measure fundamentals that are driving housing prices. Note that regardless of interpretation, the housing media sentiment index provides a useful methodology to measure unobservable factors in the housing market. Nonetheless, the effect of the media sentiment index on house prices does not appear to be driven by news stories that discuss housing fundamentals. Media sentiment also has a greater effect on house prices in markets with a larger minority homebuyer presence, in markets with more speculative investors, and among lower-priced homes. The predictive effect is also amplified in markets where more subprime loans were approved and taken out.
I interpret these results as supportive of a sentiment interpretation of the housing market, and less consistent with an informational interpretation of the media index. A story of information would instead have to account for why certain unobserved fundamentals would have greater effects in markets with more homebuyers vulnerable to sentiment than in others. The amplified effects of sentiment and subprime lending provide a potential explanation for why prior studies have been unable to account for the magnitude of house price changes with the share of minority buyers or the expansion of easy credit alone.

Without a strictly exogenous instrument for sentiment, this paper makes a careful effort to avoid drawing any conclusions about causality. Causes of the most recent housing cycle most likely cannot be summarized by one single factor, and the cross-sectional analysis of this paper suggests that the driving factors behind the last boom are more complicated. Expectations and fundamentals likely have a more complex relationship; for example, a subset of homebuyers may systematically overreact to a positive shock from lower interest rates or increases in credit supply. These results strongly suggest that sentiment should be taken seriously as a potential determinant of house prices and deserves greater attention in future research and policy concerns. In particular, the results of this paper suggest that future work might address what specific factors drive sentiment, whether the media plays a role in perpetuating financial mistakes, and whether these factors can be improved with current financial education and literacy policies.

I thank Fernando Ferreira, Joe Gyourko, Olivia Mitchell, Michael Roberts, Todd Sinai, and two anonymous referees for their valuable comments and feedback.
I also thank the seminar participants at Wharton; UC-Berkeley; George Mason University; Harvard Business School; HEC Paris; University of Illinois at Urbana-Champaign; Miami University; University of Michigan; Michigan State; New York University; Washington University at St. Louis; University of Rochester Simon; Federal Reserve Banks of Boston, Chicago, and Philadelphia; the University of California-Davis Annual Symposium; the NBER Behavioral Finance Workshop; the NBER Real Estate Summer Institute; the AEA/ASSA Meetings; and the Whitebox Advisors Graduate Student Conference for their helpful suggestions and comments. I am grateful to DataQuick for providing data for this project during my graduate studies at Wharton. Supplementary data can be found on The Review of Financial Studies Web site.

Footnotes

1 These lists can be found at http://www.wjh.harvard.edu/inquirer/Increas.html and http://www.wjh.harvard.edu/inquirer/rise.html. My dictionary source for synonyms is Roget 21st Century Thesaurus, 3rd Edition.

2 Full word lists are available upon request.

3 This calculation essentially treats all articles in one period as one long document; an alternative method is to calculate the share of positive and negative words in each individual article and then average across articles. I try both methods and do not find a difference in values.

4 Loughran and McDonald (2011) apply the same strategy except with a preceding word distance of three words. Textual analysis studies in the computer science field use a preceding distance of five words, so I opt for the wider window.

5 Details on alternative versions and their correlations with the baseline index are available on request.

6 The data for these proxies come from DataQuick, a proprietary transaction deeds records database.

7 See http://www.elliemae.com/.

8 I find sentiment has predictive power for prices up to $$k=8$$, but limit the lag structure to four quarters to conserve degrees of freedom.
The Online Appendix provides the results with an eight-quarter lag structure.

9 For full point estimates, please refer to the Online Appendix.

10 Specifically, I control for loan-to-value ratios, the fraction of subprime loans following Ferreira, Gyourko, and Tracy (2010), and the 6-month London Interbank Offered Rate (LIBOR). Please see the Online Appendix for full regression results.

11 Similar strategies are used to measure speculators in Ferreira and Gyourko (2012), Chinco and Mayer (2016), and Bayer, Geissler, and Roberts (2011).

References

Agarwal, S., Duchin, R., Evanoff, D., and Sosyura, D. 2012. In the mood for a loan: The causal effect of sentiment on credit origination. Working Paper.
Akerlof, G., and Shiller, R. J. 2009. Animal spirits: How human psychology drives the economy and why it matters for global capitalism. Princeton, NJ: Princeton University Press.
Altonji, J. G., Elder, T. E., and Taber, C. R. 2005. Selection on observed and unobserved variables: Assessing the effectiveness of Catholic schools. Journal of Political Economy 113:151–84.
Angrist, J. D., and Krueger, A. B. 1999. Empirical strategies in labor economics. In Handbook of labor economics, 1277–366. Eds. Card, D., and Ashenfelter, O. Amsterdam: Elsevier.
Antweiler, W., and Murray, F. Z. 2004. Is all that talk just noise? The information content of internet stock message boards. Journal of Finance 59:1259–94.
Baker, M., and Wurgler, J. 2006. Investor sentiment and the cross-section of stock returns. Journal of Finance 61:1645–80.
Baker, M., and Wurgler, J. 2007. Investor sentiment in the stock market. Journal of Economic Perspectives 21:129–51.
Barberis, N., Shleifer, A., and Vishny, R. 1998. A model of investor sentiment. Journal of Financial Economics 49:307–43.
Bayer, P., Geissler, C., and Roberts, J. W. 2011. Speculators and middlemen: The role of flippers in the housing market. Working Paper, National Bureau of Economic Research.
Campbell, J. Y., and Kyle, A. S. 1993. Smart money, noise trading and stock price behaviour. Review of Economic Studies 60:1–34.
Campbell, J. Y., Giglio, S., and Pathak, P. 2011. Forced sales and house prices. American Economic Review 101:2108–31.
Case, K. E., and Shiller, R. J. 1988. The behavior of home buyers in boom and post-boom markets. New England Economic Review November/December:29–46.
Case, K. E., and Shiller, R. J. 1989. The efficiency of the market for single-family homes. American Economic Review 79:125–37.
Case, K. E., and Shiller, R. J. 2003. Is there a bubble in the housing market? Brookings Papers on Economic Activity.
Case, K. E., Shiller, R. J., and Thompson, A. 2012. What have they been thinking? Home buyer behavior in hot and cold markets. Brookings Papers on Economic Activity.
Chinco, A., and Mayer, C. 2016. Misinformed speculators and mispricing in the housing market. Review of Financial Studies 29:486–522.
De Long, J. B., Shleifer, A., Summers, L. H., and Waldmann, R. J. 1990a. Noise trader risk in financial markets. Journal of Political Economy 98:703–38.
De Long, J. B., Shleifer, A., Summers, L. H., and Waldmann, R. J. 1990b. Positive feedback investment strategies and destabilizing rational speculation. Journal of Finance 45:379–95.
Demyanyk, Y., and Van Hemert, O. 2011. Understanding the subprime mortgage crisis. Review of Financial Studies 24:1848–80.
Dougal, C., Engelberg, J., Garcia, D., and Parsons, C. A. 2012. Journalists and the stock market. Review of Financial Studies 25:639–79.
Driscoll, J. C., and Kraay, A. C. 1998. Consistent covariance matrix estimation with spatially dependent panel data. Review of Economics and Statistics 80:549–60.
Engelberg, J. 2008. Costly information processing: Evidence from earnings announcements. Working Paper.
Feldman, R., Joshua, G., and Segal, B. 2008. The incremental information content of tone change in management discussion and analysis. Working Paper.
Ferreira, F., and Gyourko, J. 2011. Anatomy of the beginning of the housing boom: U.S. neighborhoods and metropolitan areas. Working Paper, National Bureau of Economic Research.
Ferreira, F., and Gyourko, J. 2012. Heterogeneity in neighborhood-level price growth in the United States, 1993-2009. American Economic Review 102:134–40.
Ferreira, F., Gyourko, J., and Tracy, J. 2010. Housing busts and household mobility. Journal of Urban Economics 68:34–45.
Foote, C. 2007. Space and time in macroeconomics panel data: Young workers and state-level unemployment revisited. Working Paper, Federal Reserve Bank of Boston.
Foote, C. L., Gerardi, K., and Willen, P. S. 2008. Negative equity and foreclosure: Theory and evidence. Journal of Urban Economics 64:234–45.
Galbraith, J. 1990. A short history of financial euphoria. New York: Viking Press.
Gentzkow, M., and Shapiro, J. M. 2010. What drives media slant? Evidence from U.S. daily newspapers. Econometrica 78:35–71.
Gerardi, K., Lehnert, A., Sherlund, S., and Willen, P. 2008. Making sense of the subprime crisis. Brookings Papers on Economic Activity.
Glaeser, E., Gottlieb, J., and Gyourko, J. 2010. Can cheap credit explain the housing boom? Working Paper, National Bureau of Economic Research.
Goetzmann, W., Peng, L., and Yen, J. 2012. The subprime crisis and house price appreciation. Journal of Real Estate Finance and Economics 44:36–66.
Greenwood, R., and Shleifer, A. 2014. Expectations of returns and expected returns. Review of Financial Studies 27:714–46.
Hanley, E., and Hoberg, G. 2010. The information content of IPO prospectuses. Review of Financial Studies 23:2821–64.
Himmelberg, C., Mayer, C., and Sinai, T. 2005. Assessing high house prices: Bubbles, fundamentals, and misperceptions. Journal of Economic Perspectives 19:67–92.
Jegadeesh, N., and Wu, D. 2013. Word power: A new approach for content analysis. Journal of Financial Economics 110:712–29.
Keys, B. J., Seru, A., and Vig, V. 2012. Lender screening and the role of securitization: Evidence from subprime loans. Review of Financial Studies 25:2071–108.
Keynes, J. M. 1936. The general theory of employment, interest and money. London: Macmillan.
Kindleberger, C. 1978. Manias, panics, and crashes: A history of financial crises. Hoboken, NJ: John Wiley and Sons.
Lai, R. N., and Van Order, R. A. 2010. Momentum and house price growth in the United States: Anatomy of a bubble. Real Estate Economics 38:753–73.
Li, F. 2006. Do stock market investors understand the risk sentiment of corporate annual reports? Working Paper.
Loughran, T., and McDonald, B. 2011. When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks. Journal of Finance 66:35–65.
Lusardi, A., and Mitchell, O. 2007. Baby boomer retirement security: The roles of planning, financial literacy, and housing wealth. Journal of Monetary Economics 54:205–24.
Mayer, C., and Sinai, T. 2009. U.S. house price dynamics and behavioral finance. In Policy making insights from behavioral economics, chapter 5. Eds. Foote, C., Goette, L., and Meier, S. Boston: Federal Reserve Bank of Boston.
Mian, A., and Sufi, A. 2009. The consequences of mortgage credit expansion: Evidence from the U.S. mortgage default crisis. Quarterly Journal of Economics 124:1449–96.
Mian, A., and Sufi, A. 2011. House prices, home equity-based borrowing and the U.S. household leverage crisis. American Economic Review 101:2132–56.
Mullainathan, S., and Shleifer, A. 2005. The market for news. American Economic Review 95:1031–53.
Nakajima, M. 2005. Rising earnings instability, portfolio choice, and housing prices. Working Paper.
Nakajima, M. 2009. Understanding house price dynamics. Business Review Q2:20–28.
Newey, W., and West, K. 1987. A simple, positive, semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica 55:703–08.
Piazzesi, M., and Schneider, M. 2009. Momentum traders in the housing market: Survey evidence and a search model. American Economic Review 99:406–11.
Roback, J. 1982. Wages, rents, and quality of life. Journal of Political Economy 90:1257–78.
Rosen, S. 1979. Wage-based indexes of urban quality of life. In Current issues in urban economics. Baltimore: Johns Hopkins University Press.
Shiller, R. J. 2005. Irrational exuberance. Princeton, NJ: Princeton University Press.
Shiller, R. J. 2008. The subprime solution. Princeton, NJ: Princeton University Press.
Shiller, R. J. 2009. Animal spirits. Princeton, NJ: Princeton University Press.
Tetlock, P. 2007. Giving content to investor sentiment: The role of media in the stock market. Journal of Finance 62:1139–68.
Tetlock, P., Saar-Tsechansky, M., and Macskassy, S. 2008. More than words: Quantifying language to measure firms’ fundamentals. Journal of Finance 63:1437–67.
Tracy, J., Schneider, H., and Chan, S. 1999. Are stocks overtaking real estate in household portfolios? Current Issues in Economics and Finance 5:1–6.

© The Author(s) 2018. Published by Oxford University Press on behalf of The Society for Financial Studies. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)

The Review of Financial Studies, Volume 31 (10) – Oct 1, 2018
31 pages

Publisher: Oxford University Press
ISSN: 0893-9454
eISSN: 1465-7368
DOI: 10.1093/rfs/hhy036

### Abstract

Abstract This paper develops first measures of housing sentiment for 34 cities across the United States by quantifying the qualitative tone of local housing news. Housing media sentiment has significant predictive power for future house prices, leading prices by nearly two years. Consistent with theories of investor sentiment, the media sentiment index has a greater effect in markets in which speculative investors are more prevalent and demand appears less informed. Directly examining the content across news articles finds that results are not driven by news stories of unobserved fundamentals. Received October 21, 2015; editorial decision December 19, 2017 by Editor Robin Greenwood. Authors have furnished an Internet Appendix, which is available on the Oxford University Press Web site next to the link to the final published paper online. Sentiment, broadly defined as the psychology behind investor beliefs, has been long posited as a determinant of asset price variation (Keynes 1936). The potential role of sentiment is particularly important for the housing market, where financial errors can have very large consequences. Over two-thirds of households in the United States own a home and invest the majority of their portfolio in real estate (Tracy, Schneider, and Chan 1999, Nakajima 2005). The housing bust in 2006 devastated millions of homeowners across the United States and overwhelmed banks and financial institutions that held significant investments in mortgage-backed securities and other housing related assets. The collapse of the subprime mortgage industry quickly followed, leading the nation into its worst economic recession since the 1930s. Shiller (2009) and others notably argue that “animal spirits” or an irrational exuberance of investors was a significant factor in the dramatic boom and bust of house prices in some markets. Testing this theory, however, is difficult without empirical measures of housing sentiment. 
Beliefs are not only unobservable and therefore not straightforward to quantify but also sentiment measures for the housing market are particularly difficult to obtain. Typical sentiment proxies for the stock market, such as mutual fund flows, dividend premiums, and closed-end fund discounts, are naturally not available for the housing market (Baker and Wurgler 2006). Survey measures that are available are limited in geographic scope. While many cities experienced unprecedented price growth in the last housing cycle, many others simultaneously saw minimal or stable movements in prices (Ferreira and Gyourko 2012). Therefore understanding the variation of price growth across local markets requires analogous city-level measures. The goal of this paper is to provide real-time local measures of sentiment across a number of city housing markets. I construct 34 city-specific housing sentiment indices corresponding to major metropolitan areas across the United States. I measure housing sentiment by quantifying the qualitative tone of local housing news media coverage. This strategy is motivated by seminal literature on asset price bubbles that argues news media has a prominent relationship with sentiment through an incentive to cater to readers’ preferences (Kindleberger 1978, Galbraith 1990, Shiller 2005). This methodology is further supported by pioneering empirical work by Tetlock (2007) and others who have found media tone to be a consistent proxy for sentiment in the stock market. By analyzing local housing news articles each month, I quantify the share of the positive and negative local media tone in each market from 2000 to the end of 2013. To my knowledge, this paper contributes the first set of city measures of housing sentiment across several markets. Housing media sentiment indices move ahead of price levels at a significant lead. 
Cities that experienced dramatic rises and declines in house prices are preceded by similar cycles in sentiment, whereas cities with milder price changes are led by more subdued or random patterns in sentiment growth. In cities with large swings in prices, sentiment appears to peak approximately some time in 2004. Though only available at a national and limited city sample, I validate my sentiment index against two available surveys of housing market expectations. I find that my media sentiment index highly correlates with the national survey and with the four cities assessed by the Case, Shiller, and Thompson (2012) survey. Both surveys also see expectations peak in early 2004, at a similar lead-lag pattern to house price levels. I further validate my sentiment index against an alternative multifactor index created with a set of proxies following the methodology of Baker and Wurgler (2007) and find that these also correlate highly. Consistent with these leading patterns, I find that housing media sentiment has robust predictive power for future house price growth. I find these effects to be both statistically significant and large in economic magnitude. This is notable as historical predictive factors have had difficulty explaining the wide swings in house prices during this period. Fundamental determinants that traditionally explained past patterns in house prices account for just a small fraction of changes post-2000 (Lai and Van Order 2010). From 2004 to 2006, for example, house prices in Miami increased by nearly 54%. Observed economic fundamentals account for approximately 21 percentage points of this growth, while the media sentiment index explains an additional 32 percentage points. These results suggest that media sentiment serves as a useful predictive factor for house prices above and beyond traditional observed variables. The positive association between the media sentiment index and future house prices is consistent with two different interpretations. 
One explanation is that these results capture the effects of sentiment, where over-exuberant beliefs pushed prices away from fundamentals. Examining the structure of the media sentiment index more closely reveals a backward-looking nature over past returns consistent with behavioral theories of extrapolative expectations. An alternative interpretation is that local unobserved fundamentals simultaneously drove house prices and beliefs such that the expectations of home buyers were justified by local information. One could even argue that the leading pattern of the media sentiment index resembles a perfect forecast in some cities. The media sentiment index could instead serve as a valuable indicator of local market information that was otherwise unobserved. I subsequently perform a series of tests to explore these two competing explanations. I first exploit the textual nature of my data to directly evaluate the media index as a proxy for local information. By tracking all words across all articles, I can examine the content across articles firsthand. I then separate the articles to create a set of “media fundamentals” indices that track the positive and negative tone across articles that discuss any relevant housing market information. I find that the effect of the housing sentiment index remains extremely robust to a discussion of fundamentals, both separately and together. The only set of articles that shares some predictive power with the media sentiment index are those that discuss credit conditions. The inclusion of additional controls for subprime lending patterns and the availability of credit, however, does not affect the significance and magnitude of the predictive effect of the housing media sentiment index. In the final section, I take advantage of the availability of the cross-sectional indices to test effects that should be consistent with an interpretation of sentiment. 
Baker and Wurgler (2006) highlight two channels through which theory predicts sentiment has cross-sectional effects on prices: (1) where demand is less informed and (2) where arbitrage constraints are particularly binding. Since arbitrage is particularly binding in the housing market, theoretical models of investor sentiment predict that markets with less-informed demand should be subject to greater effects of sentiment. Shiller (2009) and others have raised concerns particularly over the disproportionate lack of access to financial advice available to minority and low-income households. Consistent with these arguments, I find that sentiment effects are significantly greater in markets with more minority mortgage applicants and among lower-priced homes. Theories on sentiment further suggest that rational traders may respond to increased sentiment in the market to anticipate increasing asset prices. Indeed, I find that the effect of sentiment is also notably larger among housing markets with a greater entrance of speculative investors. Finally, I show that while subprime lending patterns do not affect the impact of sentiment on prices, the effect of sentiment is proportionally greater in markets where subprime lending was more prevalent. Many voiced suspicions that subprime lenders targeted those most uninformed, and we would expect those subject to sentiment to be more vulnerable to taking out risky, subprime loans. These results also provide potential context for why prior studies have been unable to explain the magnitude of the boom with the ease of credit alone, and highlight potential subprime lending effects through a relationship with sentiment. 1. The Housing Sentiment Index 1.1 Background and related literature Prominent literature on bubbles and panics stresses that the news media have an important relationship with investor beliefs (Kindleberger 1978, Galbraith 1990, Shiller 2005). 
They argue that newspapers have a demand-side incentive to cater to reader preferences and will reflect readers’ expectations over assets they own. Mullainathan and Shleifer (2005) formalize these arguments by assuming readers have a disutility for news that is inconsistent with their beliefs, citing psychology literature that shows people have a tendency to favor information that confirms their priors. Gentzkow and Shapiro (2010) find empirical evidence that readers have a preference for news consistent with their political beliefs and that news outlets respond accordingly. At the same time, Shiller (2005) argues that news outlets have the power to influence reader beliefs through a chosen tone or an emphasis of particular positive or negative events. Empirical studies of stock market sentiment have found consistent evidence for these arguments linking media coverage and market activity. Tetlock (2007) employs textual analysis to quantify media sentiment in the stock market and finds consistent evidence for its predictive effect on asset prices. Dougal et al. (2012) provide further causal evidence of financial media influencing subsequent investor behavior and stock market returns. Additional studies have further applied textual analysis techniques to capture the sentiment of earnings announcements, investor chat rooms, corporate 10-K reports, and initial public offering (IPO) prospectuses and linked to outcomes such as firm earnings, stock returns, and trading volume (Engelberg 2008, Antweiler and Murray 2004, Li 2006, Hanley and Hoberg 2010). Housing is an asset that receives heavy media attention as “a source of endless fascination for the general public, because we live in houses, we work on them every day” (Shiller 2005, p. 102). Housing is also a widely held investment by individual buyers who may be more likely subject to media slant than the typical stock market investor. 
Media tone thus presents a unique opportunity to capture expectations in a market where alternative proxies are otherwise difficult to obtain. Surveys such as Case and Shiller (2003) and the Michigan/Reuters Survey of Consumers (SOC) can provide direct assessments of home buyer expectations, but can be expensive to run and are therefore limited in frequency and/or restricted in geographic breakdown. Because news is local and recurring, the news media provide a potential medium through which we can quantify housing sentiment both across time and city by city.

Examining the impact of sentiment across housing markets is particularly important after the most recent housing cycle. Standard economic explanations for the housing boom have so far been difficult to reconcile empirically (Gerardi et al. 2008, Glaeser, Gottlieb, and Gyourko 2010, Ferreira and Gyourko 2011). Observed fundamentals that accounted for nearly 70% of the variation in national house price growth from 1987 to 2000 explain less than 10% of the variation from 2000 to 2011 (Lai and Van Order 2010). While there was much discussion over the potential role of sentiment in the last housing cycle, empirical evidence of this theory has been limited and largely anecdotal because of the lack of housing sentiment measures available. This paper thus provides a first opportunity to examine the role of sentiment in housing both across markets and across time empirically.

### 1.2 Calculating the housing sentiment index

My source for news articles is Factiva.com, a comprehensive online database of newspapers. Factiva categorizes its articles by subject, and provides a subject code that identifies articles that discuss local real estate markets. This code is determined by a proprietary algorithm that remains objective across all newspapers and years. Routine real estate property listings are not included.
Wire service articles (such as those by the Associated Press) are also generally excluded, as syndicated stories cannot be redistributed and typically do not appear in the Factiva database. Thus, articles in my sample are those written by local staff reporters. I collect all news articles discussing real estate markets between January 2000 and December 2013 from the major newspaper publications of 34 cities, including the twenty cities followed by the Case-Shiller home price indices. Most cities have one major newspaper that dominates the news market, with the exception of Boston, Detroit, and Los Angeles, which have two. Table 1 lists all newspaper sources for each city. I extract the full text of each article and record each individual word with its corresponding date, word position, and originating newspaper. My final data set assembles 37,537 newspaper articles, consisting of over 29 million words.

Table 1 Major newspaper publications by city

| City | Newspaper |
| --- | --- |
| Atlanta | The Atlanta Journal-Constitution |
| Austin | Austin American-Statesman |
| Baltimore | The Baltimore Sun |
| Boston | Boston Herald/Boston Globe |
| Charlotte | The Observer |
| Chicago | Chicago Tribune |
| Cincinnati | The Cincinnati Enquirer |
| Cleveland | The Plain Dealer |
| Columbus (OH) | The Columbus Dispatch |
| Dallas | The Dallas Morning News |
| Denver | The Denver Post |
| Detroit | Detroit News/Detroit Free Press |
| Indianapolis | The Indianapolis Star |
| Kansas City | The Kansas City Star |
| Los Angeles | LA Times/LA Daily News |
| Las Vegas | Las Vegas Review-Journal |
| Milwaukee | The Milwaukee Journal Sentinel |
| Minneapolis | Star Tribune |
| Miami | The Miami Herald |
| NYC | New York Times |
| Orlando | Orlando Sentinel |
| Philadelphia | The Philadelphia Inquirer |
| Phoenix | The Arizona Republic |
| Pittsburgh | Pittsburgh Post-Gazette |
| Portland | The Oregonian |
| Sacramento | Sacramento Bee |
| San Antonio | San Antonio Express-News |
| San Diego | The San Diego Union-Tribune |
| San Francisco | The San Francisco Chronicle |
| San Jose | San Jose Mercury News |
| Seattle | The Seattle Times |
| St. Louis | St. Louis Post-Dispatch |
| Tampa | Tampa Tribune |
| Washington, D.C. | The Washington Post |

Table 1 lists each city and its corresponding newspaper source in my sample of housing news articles ($${\rm N}=37,537$$). I draw from one major newspaper publication for most cities, with the exception of Boston, Detroit, and Los Angeles, in which I draw from the two major newspapers in the area. My sample covers articles from January 2000 to December 2013.
I capture media tone through a textual analysis of the content across newspaper articles. Textual analysis is an increasingly popular methodology used to quantify the tone of financial documents (Antweiler and Murray 2004, Loughran and McDonald 2011, Jegadeesh and Wu 2013, Hanley and Hoberg 2010). I apply the most standard methodology employed by this literature, which uses a dictionary-based method to quantify the raw frequency of positive and negative words in a text.
To do so, these papers typically identify words as positive or negative based on an external word list. External word lists such as those from the Harvard IV-4 Psychological Dictionary are preferred because they are predetermined and less vulnerable to subjectivity from the author. Recent studies have argued, however, that these general tonal lists can at times contain irrelevant words and lead to noisy measures (Tetlock, Saar-Tsechansky, and Macskassy 2008). For example, Engelberg (2008) points out that the Harvard IV-4 positive list, which contains words such as company or shares, can unintentionally capture other effects in finance applications. Loughran and McDonald (2011) show that the noise introduced by the general Harvard negative word list can also be substantial and argue that word lists should be discipline specific to reduce measurement error. To balance these concerns, I employ a predetermined list from the Harvard IV-4 dictionary to reduce subjectivity, but choose one that is relevant to how the media expresses positive or negative tone over housing. Shiller (2008) suggests that the media embellishes market activity by employing superlatives that emphasize increases and upward movements. For example, my sample includes articles with headlines such as “Home Sales Skyrocket!”, “Home Prices Zoom Up”, or “Housing is HOT, HOT, HOT!!” Thus, to capture words like skyrocket, zoom, or hot, I use the Harvard IV-4 lists Increase and Rise as a base set.1 I remove any words from the original list that would result in obvious misclassifications, and then expand the remaining words with corresponding superlatives. Following Loughran and McDonald (2011), I also expand the list with inflections and tenses that retain the original meaning of each word. Word counts for the root word skyrocket, for example, also include skyrockets, skyrocketed, and skyrocketing.
The original Harvard IV-4 lists include 136 words and the expanded list, including inflections and synonyms, contains 403 words.2 I repeat the above process to create negative word lists using the converse Harvard IV-4 lists Decrease and Fall. Table 2 reports the top twenty most frequently occurring positive and negative words in my sample.

Table 2 Top twenty most frequent positive and negative words

| Rank | Positive | % of pos words | Negative | % of neg words |
| --- | --- | --- | --- | --- |
| 1 | UP | 9.05 | DOWNS | 16.30 |
| 2 | HIGHS | 5.93 | LOW | 11.91 |
| 3 | INCREASING | 5.01 | FALLING | 10.27 |
| 4 | RISE | 4.58 | DECLINING | 8.58 |
| 5 | GREAT | 4.34 | DROPPING | 7.68 |
| 6 | SUSTAINS | 4.08 | FORECLOSING | 4.45 |
| 7 | MOST | 3.44 | SLOW | 3.78 |
| 8 | BIGGEST | 3.33 | CONTRACT | 2.57 |
| 9 | FRENZINESS | 2.92 | RECESSION | 2.04 |
| 10 | FASTEST | 2.76 | BUBBLE | 1.53 |
| 11 | BEST | 2.57 | DIPS | 1.44 |
| 12 | PUSHING | 2.51 | CONCERNING | 1.32 |
| 13 | RECORD | 2.34 | FLATTENING | 1.25 |
| 14 | HEALTHIER | 2.22 | SLUMPED | 1.21 |
| 15 | STRENGTHEN | 2.02 | WORRYING | 1.21 |
| 16 | GOOD | 1.98 | STOPPING | 1.18 |
| 17 | BOOMING | 1.84 | REDUCTION | 1.17 |
| 18 | WELL | 1.83 | COOL | 1.11 |
| 19 | GAIN | 1.63 | CRISIS | 1.03 |
| 20 | VERY | 1.48 | WEAKEN | 1.00 |

The word counts for each listed word include different tenses and inflections. So, for example, “boom” includes counts for “booms,” “boomed,” and “booming.”

I calculate the overall tone of housing news in each city $$i$$ and period $$t$$ by $$S_{it}=\left(\frac{\#pos-\#neg}{\#totalwords}\right)_{it},$$ (1) that is, the number of positive minus negative words divided by the total number of words across all housing articles in each period.3 I additionally adjust both negative and positive word counts for negation using the terms: no, not, none, neither, never, and nobody.
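The calculation in Equation (1) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function name `tone_index`, the tokenizer, and the toy word lists are hypothetical, and the choice to count a negated tonal word with the opposite polarity is an assumption about how the negation adjustment is applied. A tonal word is treated as negated when one of the listed negation terms appears within the five words before it.

```python
import re

# Negation terms listed in the text.
NEGATIONS = {"no", "not", "none", "neither", "never", "nobody"}


def tone_index(text, positive, negative, window=5):
    """Return (#pos - #neg) / #total words for one block of article text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    pos = neg = 0
    for i, tok in enumerate(tokens):
        if tok not in positive and tok not in negative:
            continue
        # A tonal word counts as negated if a negation term appears
        # among the `window` words immediately before it.
        negated = any(t in NEGATIONS for t in tokens[max(0, i - window):i])
        # Assumption: a negated word is counted with the opposite polarity.
        if (tok in positive) != negated:
            pos += 1
        else:
            neg += 1
    return (pos - neg) / len(tokens)


# Hypothetical toy word lists, not the paper's expanded 403-word lists.
POSITIVE = {"skyrocket", "skyrockets", "rising", "hot"}
NEGATIVE = {"falling", "slump"}

score = tone_index("Home sales skyrocket but prices are not rising",
                   POSITIVE, NEGATIVE)
```

In this toy sentence, "skyrocket" counts as positive and the negated "rising" counts as negative, so the two offset each other. Applying the function to all housing articles pooled by city and period would give one value of $$S_{it}$$ per city-period.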
I consider a word negated if it is preceded within five words by one of these negation terms.4 The above calculation represents the rawest, baseline index of media tone. Loughran and McDonald (2011) propose an alternative “term-weighted” index that also adjusts for the commonality and frequency of a word across documents. I find that this and other alternative variations are highly correlated with the raw baseline version above.5

### 1.3 Survey validation

Validating measures of housing sentiment is by nature challenging when beliefs are unobservable and alternative proxies are otherwise rare. Existing surveys of home buyer confidence cannot validate each measure city by city, but can be used to compare overall trends on the national level. The University of Michigan/Reuters Survey of Consumers (SOC) surveys a nationally representative sample of 500 individuals each month on their attitudes toward business and buying conditions, including those of the housing market. Specifically, the SOC asks, “Generally speaking, do you think now is a good or bad time to buy a house?” Respondents answer “yes,” “no,” or “do not know.” Figure 1 plots the percentage of respondents that answered “yes” against a national version of my housing sentiment index. I calculate a national index of media tone using the same weights applied to the twenty cities in the Case-Shiller Composite-20 home price index. My measure of housing media sentiment reveals a strikingly similar pattern to the SOC survey measure. Both measures increase rapidly from 2000 to 2003, peaking in early 2004. Both fall rapidly from 2004 to 2006, dropping well below original levels of confidence in 2000. Media sentiment appears to generally lag survey confidence on average by 2 to 6 months, consistent with theories that the media responds to reader preferences in the housing market. Both measures rebound in early 2008, peaking and declining slightly again in late 2009.
Both increases occur before the temporary rebound of the housing market in 2009, and both measures fall slightly again afterward. The correlation between the SOC survey and my media housing sentiment index is equal to approximately 0.84.

Figure 1 Validating media housing sentiment with survey of consumers
This figure compares the patterns of the composite-20 national housing media sentiment index with the Michigan/Reuters Survey of Consumers (SOC) survey of home buyers. The SOC surveys a nationally representative sample of 500 consumers and asks whether they think it is a good or bad time to buy a home. The SOC cannot be broken down by city, but provides a validation of the media sentiment index on a national level. The dashed line plots the SOC relative index, which equals the percentage who answered “Good” - “Bad” + 100. The solid line plots the housing media sentiment index, which equals the fraction of “Positive” - “Negative” words across all housing news articles per month. The media sentiment index lags the SOC survey index by a few months but mirrors a very similar pattern. The correlation between the SOC survey and (lagged) media sentiment is approximately 0.84.

Case, Shiller, and Thompson (2012) implement surveys that provide even more detailed perspectives on investor expectations. They directly ask respondents how much they expect their house price to grow over the next ten years. Answers reveal astonishingly high expectations, with respondents expecting prices to rise an average of 11% to 13% each year. Survey expenses limit coverage to four suburban areas and an annual snapshot, but nonetheless provide a valuable opportunity to validate my media index on a local level for at least four cities. Figure 2 plots the Case-Shiller survey measures against my housing sentiment index for San Francisco, Los Angeles, Boston, and Milwaukee. As with national trends, my sentiment index exhibits parallel patterns to survey measures on a city level. Both the media sentiment index and survey measures of expectations in San Francisco, Los Angeles, and Boston similarly peak in 2004 and hit a low point in 2008 before rising again. Measures for Milwaukee display more moderate patterns from 2003 to 2006, both in the survey and media index. Correlations between my housing sentiment index and the Case-Shiller Survey for each city range from approximately 0.7 to 0.74.

Figure 2 Validating media housing sentiment with Case-Shiller surveys
This figure compares the patterns of the housing media sentiment index with housing surveys by Case, Shiller, and Thompson (2012), who annually survey home buyer expectations in four cities from 2003 to 2012. The bars plot the percentage by which respondents think home prices will increase or decrease over the next year. The Case-Shiller survey is limited to an annual frequency, but provides a validation of the media sentiment index for four cities at the city level. The solid line plots the housing media sentiment index, which equals the percentage of positive minus negative words across all housing news articles per month. The media sentiment generally follows a very similar trending pattern city by city. The correlation between the Case-Shiller Survey and media index is equal to approximately 0.74 across cities.

### 1.4 Multifactor index

The high correlation of the sentiment index with the Case, Shiller, and Thompson (2012) home buyer surveys suggests that media sentiment reflects home buyer expectations. The housing market is more complicated, however, and has a number of other types of agents (lenders, banks, speculative investors) who are also simultaneously part of potential newspaper readership. All of these agents’ expectations could also be influenced or reflected in news media articles, and therefore also be represented by the media sentiment index.
The media index thus likely represents the sentiment of multiple agents in the housing market, and can even represent its own, as journalists and editors are also all potential buyers or sellers in the housing market. To validate the media sentiment index further, I create a multifactor sentiment index following the methodology in Baker and Wurgler (2006) that combines sentiment factors that may come from not only homebuyers but also lenders, banks, or outside investors/speculators. I employ the following five potential proxies for sentiment to capture these different aspects of housing sentiment.

Transaction volume: If home buyers are extrapolating prices and buying home(s) in a frenzy, we would expect to see this activity reflected in increased trading volume.

Number of 2nd+ home purchases: Similarly, if investors are extrapolating prices and buying as speculators, we would see an increase in purchases of second or additional homes. During the housing boom, there were many anecdotal stories of individuals buying two or more homes with high expectations for future house prices. This could also capture activity of rational speculators looking to flip homes for investment, as a bet on sentiment from less sophisticated home buyers continuing to rise in the short term.

Fraction of subprime mortgages: Sentiment could be reflected not only in buyers but also in lending activity. If lenders are overconfident that prices will rise, they may be more willing to lend out riskier mortgages. Following Ferreira and Gyourko (2012), this variable captures the share of loans issued by the top twenty subprime lenders ranked by Inside Mortgage Finance.

Loan-to-value ratio: If lenders are also affected by high market sentiment, then this could lead to egregious lending activity captured by absurdly high loan-to-value ratios. The loan-to-value ratios I include in this multifactor index incorporate not just the primary but any additional (up to 3) loans taken out for a particular housing transaction.

Price-to-rent ratio: During the boom of house prices, Shiller (2005) and others frequently cited high price-to-rent ratios as evidence of overvalued housing markets. The “fundamental value” of an asset typically refers to the present discounted value of its future cash flows, and models of housing assume that housing pays dividends in the form of rental services. If prices are well above fundamental value, then we would expect price growth to run well above the contemporaneous pace of rent growth.

Following Baker and Wurgler (2006), I orthogonalize each proxy on a set of economic variables (rents, population, income, unemployment, employment, and mortgage interest rates) to remove the potential influence of local housing market fundamentals. I then extract the first principal component from the residuals of these regressions. Following the rationale behind the Baker and Wurgler (2006) index, the first principal component then represents the common sentiment component across the five sentiment proxies and removes any remaining idiosyncrasies.

The multifactor and media sentiment indices are highly correlated, with the multifactor index lagging by approximately 3 quarters. Since all of the proxies in the multifactor index are transactional outcomes in response to rising sentiment (e.g., buying or flipping a house or receiving a loan in response to positive sentiment), I find this consistent with the media capturing the positive or negative tone of sentiment in real time and/or influencing subsequent expectations. The home-buying process includes several steps, from searching for mortgage lenders and qualifying for a loan to initiating the search for a home. The actual search process for a home itself can also take several months from first search to an accepted offer.
Thus, we would expect the multifactor index to lag the news sentiment index somewhat. The Online Appendix provides detailed correlations and figures on how the multifactor indices line up with the media sentiment indices by city. One limitation of the multifactor index is that the data for the five sentiment proxies require a proprietary transactions deeds records database for which data are only available for 20 out of 34 cities in my sample.6 Nonetheless, for those cities for which data are available, the multifactor index lines up well with the media housing sentiment index. This correlation suggests that the media sentiment index captures sentiment not only from home buyers, but also among agents across the housing market in general. Where data are available, the multifactor index also appears to be a potential alternative index that can validate and/or provide an additional proxy for sentiment.

## 2. House Prices and Sentiment

### 2.1 Descriptive patterns

I obtain quarterly residential home prices across my sample of 34 cities from the Federal Housing Finance Agency (FHFA). Like the Case-Shiller home price index, the FHFA home price index estimates price changes for single-family homes with repeat sales to control for the changing quality of houses being sold through time. Its estimates are based on data on repeat mortgage transactions that have been purchased or securitized by Fannie Mae or Freddie Mac. The FHFA indices cover additional cities in my sample beyond the twenty major cities followed by the Standard & Poor’s/Case-Shiller index. The two indexes, however, are highly correlated (0.87). Figure 3 plots media sentiment and house prices for select cities in my sample. (For the full sample of 34 city plots, please see the Online Appendix.) Panel A provides plots for cities with high price appreciation from 2000 to 2006.
The striking boom and bust pattern in Figure 1 is driven by cities such as Los Angeles and Las Vegas, which experienced a similar lead-lag trend in house sentiment and prices. Panel B shows that cities with minimal price appreciation such as Atlanta and Cleveland, however, appear to have more random or subdued patterns in media sentiment from 2000 to 2004. While these cities did not observe large run-ups in prices, they did experience large declines in prices that seem to be preceded by patterns of decline in media sentiment as well. As Ferreira and Gyourko (2012) also document, I observe a wide variation in house price changes across cities in my sample during this period. Likewise, I find significant variation in the timing and magnitude of appreciation of the sentiment index across cities.

Figure 3 Housing sentiment and prices
This figure plots the housing sentiment and price indexes and lists the respective mean and standard deviation by city from 2000 to 2013 for select cities. Panel A provides plots for cities with high price appreciation from 2000 to 2006, and panel B shows plots for cities with relatively lower price appreciation. The solid blue line represents the media sentiment index, and the dashed red line plots the price index. Plots for the full sample of 34 cities are available in an Online Appendix.
The lead time between media sentiment and prices seems relatively large at first glance, particularly in comparison to the stock market, where sentiment predicts prices and their reversals over just several days (Tetlock 2007, Baker and Wurgler 2006). Taken together with the timing between my media sentiment index and the multifactor index, however, this lead-lag pattern seems consistent with the transaction process and frictions in the housing market. On average, my media sentiment index peaks in mid-2004, the multifactor index peaks in mid-2005, and house prices peak in mid-2006. Both the SOC and Case-Shiller surveys also find that expectations started declining in 2004, so the pattern in the media index appears to be consistent with survey expectations. As discussed in Section 1.4, all of the proxies in the multifactor index are transactional outcomes that would result in response to rising expectations, so we would expect sentiment to rise and fall first and then see actions such as closing on a mortgage or investing in a second home follow two or three quarters later. House price indices are then based on publicly recorded transactions registered with the county deeds records, which are not recorded until after a transaction has finally closed. Depending on details of home inspection and financing, the process from first offer to final closing on a home can often take at least 1 month. The shortest average mortgage loan closing period reported by Ellie Mae, a technology company that tracks mortgage applications, is 37 days.7 Considering these frictions and that house price indices are constructed as a 3-month moving average, it seems reasonable that it would take another two or three quarters for these shifts in housing transactions underlying the multifactor index to show up in changes in house price index levels.
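One simple way to examine a lead-lag pattern like the one described above is to shift one series against the other and look for the lag with the highest correlation. The Python sketch below illustrates the idea on synthetic data only. The helper names `pearson` and `best_lead`, the maximum lag of eight quarters, and the sine-wave series are all assumptions made for illustration, not the procedure or data used in the paper.

```python
from math import sin, sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def best_lead(leader, follower, max_lag=8):
    """Return (lag, corr) maximizing corr(leader[t], follower[t + lag])."""
    best_lag, best_corr = 0, float("-inf")
    for lag in range(max_lag + 1):
        x = leader[: len(leader) - lag]  # leader observations at time t
        y = follower[lag:]               # follower observations at time t + lag
        corr = pearson(x, y)
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag, best_corr


# Synthetic quarterly series (not the paper's data): the follower repeats
# the leader three quarters later.
leader = [sin(t / 5) for t in range(56)]
follower = [sin((t - 3) / 5) for t in range(56)]
lag, corr = best_lead(leader, follower)
```

With actual data, `leader` would hold a city's quarterly media sentiment index and `follower` the multifactor index or the price index over the same quarters.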
### 2.2 Predicting prices with sentiment

Thus, the leading patterns across cities suggest that media housing sentiment has a predictive impact on prices. I explore the effect of my media measure of expectations on house prices with the following linear framework: $$\triangle p_{it+1}=\alpha+\lambda L^{k}\triangle p_{it}+\beta L^{k}\triangle s_{it}+\gamma\triangle x_{it}+\epsilon_{it+1}$$ (2) where $$i$$ denotes each city and $$t$$ indicates a quarterly period. Lowercase letters denote logs, such that $$p_{t}=\log P_{t}$$, and $$\triangle$$ denotes the first difference, such that $$\triangle p_{t+1}=\log P_{t+1}-\log P_{t}$$. I include the same number of lags of past price changes as sentiment lags in all specifications, as denoted by $$L^{k}\triangle p_{it}$$. Predictability in house price changes has been well documented across a number of studies. Among the most well-known, Case and Shiller (1989) find significant positive serial correlation and predictability with past price growth in four markets. FHFA house price changes across my sample of 34 cities exhibit positive serial correlation with an average AR(1) coefficient equal to approximately 0.42. Vector $$\triangle x_{t}$$ includes a set of economic variables such as rents, population, income, employment, and interest rates that have been shown to predict residential house price growth over time. The “fundamental value” of an asset typically refers to the present discounted value of its future cash flows, and models of housing assume that housing pays dividends in the form of rental services. I obtain city-level measures of rents from REIS.com, which provides average asking rents on rental units with common characteristics with single family homes. A number of housing studies also highlight the importance of labor market variables on housing demand (Roback 1982, Rosen 1979). I obtain local employment levels and unemployment rates by city from the Bureau of Labor Statistics (BLS).
The Rosen/Roback model of spatial equilibrium also suggests income as an important demand shifter. I include city measures of personal income per capita and population from the Bureau of Economic Analysis (BEA). Studies argue that low interest rates should lead to increased housing demand and higher prices (Himmelberg, Mayer, and Sinai 2005). I include the national 30-year conventional mortgage rate relevant to most home buyers, but for robustness also compute real interest rates following Himmelberg, Mayer, and Sinai (2005) by subtracting the Livingston Survey 10-year expected inflation rate from the 10-year Treasury bond rate. The Online Appendix provides the summary statistics for all variables. This set of economic “fundamentals” does an exceptional job explaining the changes in house prices prior to 2000, but has difficulty explaining the variation in the most recent cycle. The vector $$\Delta x_{t}$$ predicts nearly 70% of variation in composite house prices prior to 2000. After 2000, however, the same set of economic variables explains less than 10% of the variation in house price growth. Local economic variables explain more of the variation in prices city by city post-2000 than in the composite, but still explain 1.55 times more variation prior to 2000. These traditional housing factors are able to explain more variation in prices in cities that had stable house price appreciation, but account for minimal movement in cities with large swings in prices post-2000. Thus, Equation (2) considers whether a media proxy for expectations serves as a significant predictor for house prices during this period. I first normalize my index to be positive with the same adjustment as the SOC survey measure (i.e., $$pos-neg+100$$). I then orthogonalize my measure of sentiment changes, $$\Delta s_{t}$$, with respect to the observed vector of fundamentals, $$\Delta x_{t}$$.
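The orthogonalization step can be sketched as a regression of sentiment changes on the fundamentals, keeping the residuals. The following is a minimal illustration under stated assumptions, not the paper's actual procedure: the helper names (`log_diff`, `solve`, `ols_residuals`) and all numbers are hypothetical, only two fundamentals are used, and a real application would use the full vector $$\Delta x_{t}$$ and a statistics library.

```python
import math


def log_diff(levels):
    """First difference of logs: log(L_t) - log(L_{t-1})."""
    return [math.log(b) - math.log(a) for a, b in zip(levels, levels[1:])]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols_residuals(y, regressors):
    """Residuals from regressing y on a constant plus the given regressors."""
    rows = [[1.0] + list(vals) for vals in zip(*regressors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    return [yi - sum(c * v for c, v in zip(coef, r)) for r, yi in zip(rows, y)]


# HYPOTHETICAL quarterly levels for one city (illustrative numbers only).
sentiment = [102.0, 103.5, 101.8, 104.2, 106.0, 105.1, 107.3, 108.0]
rents = [950.0, 955.0, 962.0, 960.0, 971.0, 975.0, 983.0, 990.0]
income = [41.0, 41.3, 41.2, 41.9, 42.4, 42.3, 42.9, 43.5]

ds = log_diff(sentiment)                     # sentiment changes
dx = [log_diff(rents), log_diff(income)]     # changes in fundamentals
ds_orth = ols_residuals(ds, dx)              # orthogonalized sentiment changes
```

By construction, the residual series `ds_orth` is uncorrelated in sample with each fundamental included in the regression, which is the property the orthogonalized sentiment measure relies on.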
Thus, $$\Delta s_{t}$$ represents log differences in my measure of housing media sentiment, orthogonalized to observed fundamentals, and $$L^{k}$$ is a lag operator that indicates $$k$$ lags, such that $$L^{k}\Delta s_{it}=\ln S_{i,t-k}-\ln S_{i,t-k-1}$$. I impose a finite distributed lag structure with four quarterly periods such that $$k=4$$.8 The parameter $$\beta$$ then captures the total accumulated predictive effect of sentiment on house prices, summed over the individual lags $$k$$ of $$L^{k}\Delta s_{t}$$. Equation (2) tests the alternative hypothesis that $$\beta\neq0$$ against the null that $$\beta=0$$. If media sentiment simply reflects price movements or economic fundamentals already incorporated into prices, then $$\beta$$ should not be significantly different from zero. Table 3 presents the results from Equation (2). The first row reports the total accumulated effect of housing sentiment, $$\beta$$, on quarterly price growth in quarter $$t$$. The subsequent rows break down the lagged effect of sentiment by quarter. Estimates show that a 1% appreciation in four quarters of accumulated lagged sentiment is positively associated with a future quarterly price appreciation of approximately 0.93 percentage points. These coefficient estimates represent the predicted effect of housing media sentiment above and beyond both historical housing economic variables and four quarters of past price changes. To put magnitudes into context, a one-standard-deviation increase in a one-year accumulation of housing sentiment in Las Vegas predicts approximately 12% future quarterly price growth. The largest quarterly price growth in my sample occurred in Las Vegas in Q2 of 2004, when prices appreciated almost 12.5%.
Table 3 Predicting house price growth with housing sentiment

| Dep var: House price growth, $$t$$=quarterly | (1) | (2) | (3) | (4) |
| --- | --- | --- | --- | --- |
| Sum of lagged sentiment | 0.933 | 0.933 | 0.933 | 0.670 |
| | (7.003) | (3.064) | (3.643) | (3.319) |
| Qtr 1 $$(\Delta s_{t-1})$$ | 0.301 | 0.301 | 0.301 | 0.211 |
| | (6.905) | (3.503) | (3.792) | (3.722) |
| Qtr 2 $$(\Delta s_{t-2})$$ | 0.239 | 0.239 | 0.239 | 0.170 |
| | (5.189) | (2.823) | (2.947) | (2.359) |
| Qtr 3 $$(\Delta s_{t-3})$$ | 0.232 | 0.232 | 0.232 | 0.158 |
| | (4.744) | (2.284) | (2.833) | (2.227) |
| Qtr 4 $$(\Delta s_{t-4})$$ | 0.162 | 0.162 | 0.162 | 0.129 |
| | (4.120) | (2.703) | (2.934) | (2.720) |
| Lagged price changes | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Lagged fundamentals | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Lagged national sentiment | . | . | . | $$\checkmark$$ |
| SE: Newey-West | $$\checkmark$$ | . | . | . |
| SE: Driscoll-Kraay | . | $$\checkmark$$ | . | . |
| SE: double clustered by $$(i,t)$$ | . | . | $$\checkmark$$ | $$\checkmark$$ |
| Observations | 1,676 | 1,676 | 1,676 | 1,676 |
| Adjusted $$R^{2}$$ | 0.38 | 0.38 | 0.38 | 0.44 |

Coefficient estimates from predictive regressions of sentiment on house prices specified in Equation (2). The first row reports the accumulated impact of a lagged year of sentiment growth on future quarterly price growth. The subsequent rows break down this summed impact by quarterly lag. All columns control for four quarters of past price changes and a vector of housing fundamentals, $$\Delta x_{t-1}$$, described in the text. The corresponding $$t$$-statistics of each estimate are reported in parentheses and are based on the respective calculated standard errors. (Note that the first row reports $$t$$-statistics based on standard errors adjusted appropriately for the summed estimates of sentiment lags.) Column 4 controls for an additional four quarterly lags of national housing sentiment measured by the Michigan Survey of Consumers. Full regression results with all point estimates can be found in an Online Appendix.
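As a quick check on Table 3, the summed coefficient in the first row should equal the sum of the four quarterly lag coefficients, and its standard error depends on the full covariance matrix of those estimates, not only on their individual standard errors. The sketch below uses the lag coefficients reported in Columns 1 through 3; the covariance matrix is purely hypothetical (the paper does not report it) and is included only to illustrate the formula $$\mathrm{Var}(\sum_{k}\hat{\beta}_{k})=\mathbf{1}'V\mathbf{1}$$.

```python
import math

# Quarterly lag coefficients reported in Table 3, Columns 1-3.
lag_coefs = [0.301, 0.239, 0.232, 0.162]
total = sum(lag_coefs)  # matches the reported 0.933 up to rounding of the inputs


def se_of_sum(cov):
    """Standard error of a sum of estimates: sqrt of the sum of all covariance entries."""
    return math.sqrt(sum(sum(row) for row in cov))


# HYPOTHETICAL covariance matrix of the four lag estimates (not from the paper).
hypothetical_cov = [
    [0.0064, 0.0020, 0.0010, 0.0005],
    [0.0020, 0.0066, 0.0020, 0.0010],
    [0.0010, 0.0020, 0.0067, 0.0020],
    [0.0005, 0.0010, 0.0020, 0.0031],
]

se_total = se_of_sum(hypothetical_cov)
t_total = total / se_total  # illustrative only; depends entirely on the assumed covariance
print(f"sum of lags = {total:.3f}, illustrative t-statistic = {t_total:.2f}")
```

Because positive covariances between adjacent lag estimates enter the sum, the $$t$$-statistic of the summed effect cannot be read off the individual $$t$$-statistics, which is why the table reports it separately.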
Columns 1 through 3 of Table 3 compare $$t$$-statistics calculated under three different standard error procedures to test the stability of coefficient significance. In Column 1, I assume the error term $$\epsilon_{t+1}$$ is heteroscedastic across time and serially correlated within city, and calculate panel Newey and West (1987) standard errors that are robust to heteroscedasticity and autocorrelation up to twelve lags. This column assumes errors are correlated only within city, since studies have documented little mobility of homeowners across states. Nonetheless, the presence of spatial correlation across my measures could severely understate calculated standard errors (Foote 2007). To address potential cross-sectional spatial dependence, Column 2 reports Driscoll and Kraay (1998) standard errors for robustness. This does increase my standard errors, as indicated by the lower $$t$$-statistics reported in parentheses, suggesting some dependence exists across cities. Nonetheless, my estimates remain statistically significant. In Column 3, I allow for further flexibility in the structure of this dependence and cluster my standard errors by both time and city. Doing so results in slightly higher $$t$$-statistics than in Column 2, and estimates remain statistically significant. The first three columns of Table 3 show that past local media sentiment has a positive and significant association with future quarterly house price growth above and beyond past local returns and local economic fundamentals. While house price cycles have been predominantly local in the past, the most recent cycle was marked by an unprecedented number of geographically separate cities that appeared to experience boom and bust trends at similar times, suggesting a potential national component across markets. Thus, in addition to local media sentiment, Column 4 includes the SOC Housing Survey as a proxy for nationwide housing sentiment.
The SOC Housing Survey does have significant predictive power for local returns as well, though with a smaller magnitude than the local index.9 Including national sentiment drops the magnitude of the local media sentiment index from 0.933 to 0.670, a decline of $$(0.933-0.670)/0.933\approx0.28$$, suggesting that a national systematic component accounts for approximately 30% of the local sentiment index. Thus, while there is a strong local relationship between returns and sentiment, there is a notable systematic component across local housing sentiment as well. These results are consistent with the puzzle of simultaneous common yet heterogeneous movement observed across local housing markets during the last housing cycle. While an unprecedented number of local housing markets appeared to boom and bust at similar times, there were still many cities whose markets did not boom at all, and yet looked similar on fundamentals to those that did. This puzzle points to why it is difficult for a national factor, such as the mortgage rate, to explain the breadth of price differences across local housing markets. Glaeser, Gottlieb, and Gyourko (2010), for example, found that mortgage rates and available credit can explain up to one-fifth of price appreciation across cities. Nonetheless, these results suggest that there is evidence of a systematic factor that plays a significant role across sentiment and prices. The following section further explores potential common determinants in sentiment across cities.

2.3 Determinants of sentiment

The results thus far show that expectations, as proxied by media sentiment, are positively associated with future house prices. At the same time, expectations may also be influenced by past price changes.
In their first survey of home buyer expectations, Case and Shiller (1988) concluded “people seem to form their expectations from past price movements rather than knowledge of fundamentals.” In their updated surveys, Case, Shiller, and Thompson (2012) find that home buyers’ expectations appeared extremely high, projecting an appreciation of more than 10% per year for the next 10 years. While these expectations at first seem wildly unrealistic, Case, Shiller, and Thompson (2012) note that the Case-Shiller composite-10 index had indeed appreciated nearly 11% per year over the previous ten years, from 1996 to 2006. Greenwood and Shleifer (2014) examine six different surveys of investor expectations on the stock market, and find that investor expectations do appear to extrapolate from past stock market returns. I explore the nature of my measure of housing expectations with a linear specification analogous to the one in their study:

$$S_{it}=\alpha+\lambda R_{it-k}+\delta P_{it}+\gamma\log X_{it}+city_{i}+u_{it}\qquad(3)$$

where $$S_{it}$$ denotes the level of housing expectations in city $$i$$ at quarter $$t$$, and $$R_{it-k}$$ represents the cumulative return in city $$i$$’s local housing market over the $$k$$ periods ending at $$t$$. I control for both price-to-rent and price-to-income ratios, as denoted by $$P_{it}$$, and $$X_{it}$$ is the same vector of economic variables from Equation (2). I report results based on standard errors double clustered by time and city. Table 4 reports the estimates from Equation (3) for four different time horizons. Columns 1 through 4 regress the housing media sentiment index on 6-month, 1-year, 3-year, and 5-year cumulative price growth, respectively. The results show that past price appreciation predicts higher media sentiment, consistent with the home buyer expectations in the survey responses of Case and Shiller (1988) and Case, Shiller, and Thompson (2012).
Column 1 shows that the coefficient on a 6-month lagged cumulative return is equal to approximately 7.3. This implies that if returns to housing in the last 6 months increase by one standard deviation (approximately 5.5% across my sample period), then the housing media sentiment index will increase by approximately 0.5 units. Given that the media sentiment index ranges from approximately $$-3.5$$ to 10.6, the impact of 0.5 units is sizeable and significant in magnitude.

Table 4 Predicting housing sentiment with past returns

| Dep var: Media sentiment $$S_{it}$$, $$t$$=quarterly | 6 months (1) | 1 year (2) | 3 years (3) | 5 years (4) |
| --- | --- | --- | --- | --- |
| Cumulative price appreciation | 7.310 | 5.362 | 1.404 | -0.331 |
| | (3.85) | (4.25) | (1.75) | (-0.53) |
| Price/rent | 10.956 | 9.204 | 8.057 | 14.633 |
| | (2.96) | (2.55) | (1.63) | (2.15) |
| 30-year mortgage rate | -1.123 | -1.029 | -1.309 | -1.472 |
| | (-5.25) | (-5.06) | (-6.17) | (-8.07) |
| City FEs | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Lagged fundamentals | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| SE: double clustered by $$(i,t)$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Observations | 1,852 | 1,852 | 1,852 | 1,852 |
| Adjusted $$R^{2}$$ | 0.312 | 0.319 | 0.292 | 0.285 |

Coefficient estimates from Equation (3), regressing sentiment levels on past cumulative returns, price-to-rent, price-to-income, and all fundamentals. The table reports only the price-to-rent and mortgage rate controls; all other fundamentals are insignificant and of negligible magnitude. All inference is based on standard errors double clustered by quarter and city, and corresponding $$t$$-statistics are reported in parentheses. Estimates show that when past returns increase, housing sentiment also rises. This effect is robust up to a window of 3 years, but the magnitude of the effect appears to decline with a longer horizon of 5 years.

The most recent returns appear to have the greatest impact, though past returns have a positive impact at horizons of up to 3 years. The magnitude of the estimated effect of lagged returns declines with distance: the coefficient falls from approximately 5.4 on one-year cumulative returns to 1.4 on 3-year returns. Extending the window of returns to 5 years in Equation (3) results in an estimate that is negative and not statistically significant. The effect of lagged returns on media sentiment holds only for the most recent returns. These findings suggest that housing expectations, as proxied by the media, possess a backward-looking nature consistent with theories of extrapolative expectations in behavioral finance (Barberis, Shleifer, and Vishny 1998, Campbell and Kyle 1993, Delong et al. 1990b). These results are also consistent with the structure of investor expectations Greenwood and Shleifer (2014) find in the stock market. Price-to-rent ratios are positively correlated with sentiment levels, but lose significance and magnitude beyond a window of 1 year. Further including price-to-income ratios has a negligible and insignificant effect. Note that all columns also include city fixed effects. Without city fixed effects, the predictive effect of past returns on sentiment is slightly higher, suggesting that the effect of returns on sentiment comes not only from within-city variation but also from differences in sentiment levels across cities.
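The cumulative-return regressor $$R_{it-k}$$ in Equation (3) can be illustrated with a short sketch. It assumes the cumulative return over a horizon of $$k$$ quarters is measured as the log price change from $$t-k$$ to $$t$$; that definition, the function name, and the price series are assumptions made for illustration and are not taken from the paper.

```python
import math


def cumulative_log_return(prices, k):
    """Log price change over the k quarters ending at each date t (None if undefined)."""
    return [
        math.log(prices[t] / prices[t - k]) if t >= k else None
        for t in range(len(prices))
    ]


# HYPOTHETICAL quarterly house price index for one city (illustrative only).
price_index = [100.0, 101.5, 103.2, 105.0, 107.1, 109.8, 112.0, 113.1, 113.5]

# Horizons in quarters: 6 months and 1 year, as in Columns 1 and 2 of Table 4.
horizons = {"6 months": 2, "1 year": 4}
returns = {label: cumulative_log_return(price_index, k) for label, k in horizons.items()}

for label, series in returns.items():
    latest = series[-1]
    print(f"{label}: latest cumulative log return = {latest:.4f}")
```

Longer horizons such as 3 or 5 years follow the same pattern with `k` set to 12 or 20 quarters, at the cost of losing the first `k` observations of each city's series.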
I include all housing fundamentals from Equation (2), but only the mortgage rate has any effect of notable significance. Table 4 reports the coefficient estimates on the 30-year mortgage rate, and shows that the mortgage rate is negatively associated with media sentiment. As the mortgage rate declines, housing media sentiment increases. This is consistent with SOC respondents who answered optimistically about housing because mortgage rates were low. As Piazzesi and Schneider (2009) point out, interest rates have historically been a major driver of housing perception, and these estimates suggest that mortgage rates continued to have an impact on public perception during this sample period as well. Nominal mortgage interest rates reached their lowest point in mid-2004, coinciding with the peak of many of the local sentiment indices. Real interest rates, however, actually hit their lowest point later, in 2005. This common timing of the sentiment index with nominal, rather than real, interest rates is consistent with the behavioral bias of money illusion, that is, the tendency of individuals to think only in nominal terms. The results in Table 4 suggest that (nominal) mortgage interest rates are an important common determinant of housing media sentiment across cities, consistent with this money illusion bias.

3. Interpretation

The results in Section 2 show that housing sentiment, as captured by the media, is positively associated with future house price growth. This predictive effect is robust to known fundamental controls that have explained house prices well historically. The structure of the housing sentiment index appears to be extrapolative in nature, and peaks more than two-and-a-half years ahead of house prices on average. One interpretation of these results is that they capture the effects of sentiment, in which investor beliefs were over-optimistic and drove house prices away from fundamentals.
The media sentiment index certainly does mirror patterns in the Case, Shiller, and Thompson (2012) survey, where home buyer expectations look unjustifiably high and similarly peak in 2004. At the same time, however, these results could also be consistent with a story of unobserved fundamentals that were instead driving price growth. Housing markets are inherently local, and local media in particular could convey information on local fundamentals that are otherwise difficult to observe or collect. Another possibility then is that the media sentiment index represents a valuable source of unobserved information about the local housing market. Reviewing the patterns in Figure 2, particularly in markets with a defined boom and bust of prices, one could argue that the media proxy even looks like a perfect forecast, perhaps an indication that expectations reflected justified local information. The goal in this section is to provide a set of tests to explore these two interpretations empirically. The advantage of measuring housing sentiment with the news media is that we can exploit the richness of the data both across text and across cross-sectional city indices. We can then use this analysis to test results that we expect to be consistent with an interpretation of sentiment versus a story of information on unobserved fundamentals.

3.1 Testing the news content over fundamentals

I first address the interpretation of the media sentiment index as a source of unmeasured fundamentals by examining the informational content of the articles firsthand. By keeping track of all of the text across articles, I can directly examine whether my media sentiment index is driven by articles discussing particular housing variables. Following a strategy analogous to that of Tetlock, Saar-Tsechansky, and Macskassy (2008), I flag any news article that mentions stem words such as “rents,” “population,” or “taxes” that may indicate discussions of local housing market information.
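A minimal sketch of this flagging step follows, together with a positive-minus-negative word share computed over the flagged articles. The topic stems, the tiny positive and negative word lists, and the sample articles are all hypothetical placeholders; the paper's actual dictionaries and text processing are not reproduced here.

```python
import re

# HYPOTHETICAL topic stems and tone word lists (placeholders, not the paper's dictionaries).
TOPIC_STEMS = {"rents": ("rent",), "demographics": ("population", "income")}
POSITIVE = {"boom", "strong", "soar", "gain"}
NEGATIVE = {"slump", "weak", "fall", "drop"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def mentions_topic(tokens, stems):
    """True if any token starts with one of the topic stems."""
    return any(tok.startswith(stem) for tok in tokens for stem in stems)


def media_fundamental(articles, stems):
    """Share of positive minus negative words across articles that mention the topic."""
    pos = neg = total = 0
    for text in articles:
        tokens = tokenize(text)
        if not mentions_topic(tokens, stems):
            continue  # skip articles that never mention this fundamental
        pos += sum(tok in POSITIVE for tok in tokens)
        neg += sum(tok in NEGATIVE for tok in tokens)
        total += len(tokens)
    return (pos - neg) / total if total else 0.0


# HYPOTHETICAL article snippets for one city and quarter.
articles = [
    "Rents soar as strong demand continues",
    "Population gain lifts home sales",
    "Builders see weak quarter",
]
media_rents = media_fundamental(articles, TOPIC_STEMS["rents"])
media_demographics = media_fundamental(articles, TOPIC_STEMS["demographics"])
```

Prefix matching on stems is one simple design choice; it catches variants such as "rental" and "renters" but would need a proper stemmer or word-boundary rules to avoid false matches in real text.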
I then quantify the fraction of positive and negative words across these articles to create a set of “media fundamentals” analogous to my overall housing media sentiment index. For example, in Table 5, “media rents” is an index of the positive and negative words across all local articles that discuss rents, “media user costs” refers to words across any articles that discuss factors that enter into the user cost of housing, such as property taxes and maintenance costs, and “media demographics” covers any articles that discuss local population and income. Through this strategy, the news media index also presents a potential opportunity to quantify particular information in markets where fundamentals are difficult to observe. I then additionally control for these media fundamentals in Equation (2) to see if any of these measures reduce my estimated effect of expectations. If the discussion of fundamentals in these articles drives the patterns in my main housing media sentiment index, then controlling for words in these articles should drive down the significance and magnitude of the results in Table 3. I control for two quarterly lags of each media fundamental as well as all observed fundamental controls from Equation (2). Table 5 shows that the estimated effects of media sentiment on price growth remain robust to controlling for news content over fundamentals. Columns 1 through 7 add each media control sequentially to test the stability of the coefficient estimate for the overall sentiment index, $$\beta$$. Coefficient estimates of $$\beta$$ remain remarkably robust with the inclusion of each additional control and decline neither in significance nor magnitude. As argued by a number of previous studies, the stability of estimates to the sequential addition of controls suggests bias from unobserved factors is less likely (Altonji et al. 2005, Angrist and Krueger 1999).

Table 5 Is the housing sentiment index driven by news stories on fundamentals?

| Dep var: Housing price growth, $$t$$=quarterly | (1) | (2) | (3) | (4) | (5) | (6) | (7) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Sum of lagged sentiment | 0.929 | 0.939 | 0.940 | 0.943 | 0.924 | 0.891 | 0.836 |
| | (3.636) | (3.735) | (3.715) | (3.711) | (3.440) | (3.283) | (3.061) |
| Media rents | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Media labor market conditions | . | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Media housing supply | . | . | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Media user costs | . | . | . | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Media demographics | . | . | . | . | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Media local GDP & inflation | . | . | . | . | . | $$\checkmark$$ | $$\checkmark$$ |
| Media credit conditions | . | . | . | . | . | . | $$\checkmark$$ |
| Lagged prices | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Lagged fundamentals | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| SE: double clustered by $$(i,t)$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Observations | 1,676 | 1,676 | 1,676 | 1,676 | 1,676 | 1,676 | 1,676 |
| Adjusted $$R^{2}$$ | 0.379 | 0.380 | 0.380 | 0.381 | 0.383 | 0.385 | 0.387 |

Coefficient estimates for sentiment from Equation (2), with additional controls for news content over fundamentals. Each media fundamental listed identifies any news article that mentions a particular fundamental in its text. The variable “media rents”, for example, is the share of positive minus negative words in any real estate articles that mention any word related to “rents” in its text. Standard errors are double-clustered by quarter and city, and corresponding $$t$$-statistics are reported in parentheses. Estimates of lagged media represent the impact of a 1 $$\%$$ increase in the accumulated past year growth of sentiment on the quarterly growth in house prices. Estimates remain robust to the inclusion of “media fundamentals,” suggesting that the estimated effect is not driven by a particular set of stories on fundamentals.

The estimated effect of housing media sentiment remains significant, positive, and large in magnitude after accounting for an extensive set of fundamental articles. The only media fundamental to reduce the magnitude slightly is the fraction of positive and negative words across articles discussing credit conditions. While not a typical housing fundamental historically, an additional new factor largely debated during the housing crisis was the availability of easy credit.
Mian and Sufi (2011) show that the extraordinary rise in house prices from 2000 to 2005 was accompanied by an unprecedented expansion of mortgage credit, particularly in the subprime market (Mian and Sufi 2009; Glaeser, Gottlieb, and Gyourko 2010). Easing lending standards and rising approval rates opened home-buying to a new set of consumers, potentially allowing a new group of homebuyers to shift aggregate demand and drive up house price growth (Keys, Seru, and Vig 2012; Mian and Sufi 2011). Including additional controls for subprime lending and easy credit, however, does not diminish the predictive effect of the media index on prices (see footnote 10).

3.2 Cross-sectional effects of housing media sentiment

The preceding sections take advantage of the textual nature of the data, which allows for a direct analysis of the content across articles. Measuring sentiment with news media further provides an avenue to create indices across several markets in real time, creating an opportunity to explore sentiment effects cross-sectionally. Given the city-wide variation in house price changes during this period, this is particularly useful when exploring the relationship between sentiment and prices in the housing market. With access to city-specific indices, we can explore whether prices and sentiment vary systematically according to any city-level traits. The cross-sectional nature of the data thus allows us to test whether cross-sectional differences exist based on characteristics that we would expect to be consistent with a theory of sentiment.

3.2.1 Minority home buyers

Baker and Wurgler (2006) highlight two channels through which theory predicts sentiment has cross-sectional effects on prices: (1) where demand is less informed and (2) where arbitrage constraints are particularly binding. Since arbitrage constraints are completely binding in the housing market, this suggests that potential cross-sectional effects lie in differences in informed demand for housing among buyers.
For example, Shiller (2008) raises concerns about the disproportionate lack of access to adequate financial advice among minority groups, which may lead to investment decisions based on minimal or biased information. Indeed, in a comprehensive survey of financial literacy, Lusardi and Mitchell (2007) find that financial knowledge and planning are at their lowest levels among Hispanic and Black respondents.

The Home Mortgage Disclosure Act (HMDA) requires lending institutions to file reports on all mortgage applications, and thus provides an opportunity to test cross-sectional effects based on the demographic profile of the pool of potential home buyers. Following Ferreira and Gyourko (2012), I define a “% minority” variable based on the fraction of African-American and Hispanic loan applicants as coded in the HMDA data set. I calculate the average fraction of minority loan applicants across the sample period for each city, and then divide the 34 cities in my sample into two equal groups based on whether the city contains a low or high fraction of potential minority homebuyers. I then estimate the following equation to test whether sentiment effects differ across groups:

\begin{align}
\triangle p_{it+1} &= \alpha+\lambda L^{k}\triangle p_{it}+\beta L^{k}\Delta s_{it}+High+\beta_{High}L^{k}\Delta s_{it}\times High \nonumber\\
&\quad +\,\gamma\triangle x_{it}+\epsilon_{it+1} \tag{4} \label{eq:5}
\end{align}

Equation (4) examines the same predictive relationship between prices and media sentiment as in Equation (2), but now explores additional interactions between the fraction of potential minority homebuyers and lagged sentiment. $$High$$ is an indicator variable for a city with a high fraction of minority buyers, and is interacted with all included lags of media sentiment. Thus, $$\beta$$ now represents the baseline effect of sentiment for cities with the lowest fraction of minority homebuyers.
The parameter $$\beta_{High}$$ then captures the additional sentiment effect of being in the high group. If there are no cross-sectional differences across cities, the coefficient $$\beta_{High}$$ should not be significantly different from zero. If we presume the pool of buyers with low access to financial advice to be more likely subject to sentiment, then we would expect sentiment effects to be larger in cities with a larger fraction of minority buyers. In other words, a significant coefficient $$\beta_{High}>0$$ would indicate a cross-sectional effect consistent with a theory of sentiment.

Column 1 of Table 6 reports the coefficient estimates $$\beta$$ in the first row and $$\beta_{High}$$ in the row directly below. The results reveal that sentiment effects do appear to be greater in cities with a higher fraction of minority loan applicants. Baseline estimates for the low group indicate that a 1% appreciation in accumulated lagged sentiment predicts a future quarterly price appreciation of approximately 0.6 percentage points. This is still significant in magnitude relative to the average quarterly house price appreciation (0.8). Estimates for $$\beta_{High}$$, however, suggest that the same increase in sentiment would lead to an additional 1.3 percentage point increase in future price growth. This means that the impact of sentiment in cities with a larger fraction of minority mortgage applicants is more than twice as large as in those with fewer potential minority home buyers. In their analysis, Ferreira and Gyourko (2012) find that the share of minority home buyers is not able to explain the timing or magnitude of the house price booms across cities. These results suggest that while the fraction of minority purchasers may not be able to account for the changes in house prices alone, it may still have served as a factor in combination with changing sentiment during this period.
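The interaction logic of Equation (4) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's estimation code: it drops the lagged-price and fundamentals controls, uses a single sentiment regressor in place of the sum of lags, and the city names, shares, and data are synthetic. With only an intercept, the sentiment term, $$High$$, and their interaction, the regression is equivalent to fitting a separate line in each group, so $$\beta$$ is the Low-group slope and $$\beta_{High}$$ is the difference between the High-group and Low-group slopes.

```python
from statistics import mean


def median_split(city_share):
    """Return the set of cities in the top half of `city_share` (the High group)."""
    ranked = sorted(city_share, key=lambda c: (city_share[c], c))
    return set(ranked[len(ranked) // 2:])


def ols_line(x, y):
    """Slope and intercept of a one-regressor OLS fit of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def interaction_estimates(rows, high_cities):
    """Estimate (beta, beta_high) from a list of
    (city, sentiment_growth, next_quarter_price_growth) tuples.

    Simplified model (controls omitted, an assumption of this sketch):
        dp = a + beta*ds + c*High + beta_high*(ds*High) + e
    """
    low = [(s, p) for c, s, p in rows if c not in high_cities]
    high = [(s, p) for c, s, p in rows if c in high_cities]
    beta, _ = ols_line(*zip(*low))          # baseline (Low group) slope
    beta_total, _ = ols_line(*zip(*high))   # total slope in the High group
    return beta, beta_total - beta          # beta_high is the additional effect


# Synthetic example: four hypothetical cities with made-up minority shares.
minority_share = {"A": 0.10, "B": 0.15, "C": 0.40, "D": 0.55}
high = median_split(minority_share)

rows = []
for city in minority_share:
    for ds in (-1.0, 0.0, 1.0, 2.0):
        # Noise-free data: slope 0.5 in Low cities, 2.0 in High cities.
        dp = 2.0 + 2.0 * ds if city in high else 1.0 + 0.5 * ds
        rows.append((city, ds, dp))

beta, beta_high = interaction_estimates(rows, high)
print(round(beta, 3), round(beta_high, 3))
```

In this synthetic data the High-group slope is 2.0 and the Low-group slope is 0.5, so the sketch recovers a baseline of 0.5 and an additional effect of 1.5. Once controls that are not interacted with $$High$$ are added, as in Equation (4), the group-by-group shortcut no longer applies exactly and a full multiple regression is needed.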
Table 6 Cross-sectional effects of housing sentiment

| | (1) | (2) | (3) | (4) Low price homes | (5) High price homes |
| --- | --- | --- | --- | --- | --- |
| Sum of lagged media sentiment | 0.582 | 0.578 | 0.850 | 1.394 | 0.806 |
| | (2.280) | (1.928) | (1.297) | (4.903) | (3.919) |
| $$\quad\times High$$ minority buyers | 1.315 | . | . | . | . |
| | (3.748) | . | . | . | . |
| $$\quad\times High$$ 2nd home buyers | . | 1.848 | . | . | . |
| | . | (4.183) | . | . | . |
| $$\quad\times High$$ subprime loans | . | . | 1.414 | . | . |
| | . | . | (2.176) | . | . |
| Lagged prices ($$\triangle p_{t-1...t-4}$$) | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Lagged fundamentals $$(\triangle x_{it-1})$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| SE: double clustered by $$(i,t)$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ | $$\checkmark$$ |
| Observations | 1,450 | 1,450 | 754 | 824 | 824 |
| Adjusted $$R^{2}$$ | 0.392 | 0.396 | 0.424 | 0.633 | 0.633 |

Columns 1-3 report the estimates for $$\beta$$ and $$\beta_{High}$$ in Equation (4), exploring cross-sectional effects of sentiment on house price changes. Each coefficient estimate measures the accumulated impact of a lagged year of sentiment growth on future quarterly price growth. The baseline estimate for $$\beta$$ represents the effect for the $$Low$$ group described in the text, and estimates of $$\beta_{High}$$ represent the additional effect of sentiment on prices for the corresponding $$High$$ group. Columns 4 and 5 estimate the effect of sentiment from Equation (2) on low versus high-tier housing markets as measured by the Case-Shiller house price index. Note that data coverage for estimations in Columns 3-5 tracks fewer cities, and thus the number of observations in these columns is smaller. All columns control for past price changes and the same vector of housing fundamentals as in all previous tables. All inference is based on double-clustered standard errors by quarter and city, and the corresponding $$t$$-statistics are reported in parentheses. Results show that estimated sentiment effects are greater across groups with investors that are likely more subject to housing sentiment.
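As a worked reading of Table 6, the implied total effect of sentiment in each $$High$$ group is the sum of the baseline and interaction estimates, $$\beta+\beta_{High}$$. The sums and ratios below are simple arithmetic on the reported point estimates, with ratios rounded to one or two decimals. They are not additional estimates from the paper and carry no standard errors.

\begin{align*}
\text{Column 1 (minority buyers):}\quad & \beta+\beta_{High} = 0.582+1.315 = 1.897 \approx 3.3\,\beta\\
\text{Column 2 (2nd home buyers):}\quad & \beta+\beta_{High} = 0.578+1.848 = 2.426 \approx 4.2\,\beta\\
\text{Column 3 (subprime loans):}\quad & \beta+\beta_{High} = 0.850+1.414 = 2.264 \approx 2.7\,\beta\\
\text{Columns 4 and 5 (price tiers):}\quad & 0.806/1.394 \approx 0.58
\end{align*}

Read this way, the total effect in each $$High$$ group is between roughly 2.7 and 4.2 times the corresponding baseline, and the high-tier estimate is a little under three-fifths of the low-tier estimate.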
3.2.2 Speculators

Piazzesi and Schneider (2009) analyze housing answers from the Michigan Survey of Consumers and identify a set of homebuyers as “momentum” traders, or those that invested in housing because of an increased optimistic belief in rising house prices. Indeed, De Long et al. (1990a) show that traders may rationally respond to growing sentiment in the market and buy in anticipation of increasing prices. Chinco and Mayer (2016) document the presence of second home buyers across local housing markets from 2000 to 2007, and find that nonlocal investors not only behave as speculators but also do not appear particularly well informed. Bayer, Geissler, and Roberts (2011) find that speculative activity, particularly among naive investors, increased during the housing boom. Regardless of whether these investors were rationally chasing sentiment or subject to it themselves, these studies present an additional opportunity to test cross-sectional effects of sentiment across markets. Given a measure of speculators, evidence suggests that we should observe a larger impact of sentiment in markets where there is greater speculation.

Utilizing the transaction-level records in the DataQuick data set, I define the number of speculators in a city as those who purchased one or more homes that are not owner-occupied (see footnote 11). I again divide the cities in my sample into two equal groups, but now based on the presence of speculators across markets. $$Low$$ thus refers to markets with a low number of speculators, and $$High$$ is an indicator variable for cities with an above-median number of speculators. I then run the same regression in Equation (4), but test for cross-sectional differences based on the number of speculators across markets. Consequently, $$\beta$$ now represents the baseline effect of sentiment for cities with low or no speculators, and $$\beta_{High}$$ now represents the additional sentiment effect of having a high presence of speculators.
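The speculator classification described above can be sketched as follows. This is a minimal illustration, not the author's code: the record fields `city`, `buyer_id`, and `owner_occupied` are hypothetical stand-ins for whatever the DataQuick deed records actually contain, the sample records are made up, and ties at the median are broken by city name purely for determinism.

```python
from collections import defaultdict


def speculators_per_city(transactions):
    """Count distinct buyers per city with at least one non-owner-occupied purchase.

    `transactions` is a list of dicts with hypothetical keys
    'city', 'buyer_id', and 'owner_occupied' (bool).
    """
    buyers = defaultdict(set)
    for t in transactions:
        buyers[t["city"]]  # touch the key so cities without speculators count as 0
        if not t["owner_occupied"]:
            buyers[t["city"]].add(t["buyer_id"])
    return {city: len(ids) for city, ids in buyers.items()}


def high_speculation_cities(counts):
    """Cities in the top half by speculator count (the High indicator)."""
    ranked = sorted(counts, key=lambda c: (counts[c], c))
    return set(ranked[len(ranked) // 2:])


# Made-up records for four hypothetical cities.
transactions = [
    {"city": "A", "buyer_id": 1, "owner_occupied": True},
    {"city": "A", "buyer_id": 2, "owner_occupied": False},
    {"city": "B", "buyer_id": 3, "owner_occupied": False},
    {"city": "B", "buyer_id": 3, "owner_occupied": False},  # same buyer, counted once
    {"city": "B", "buyer_id": 4, "owner_occupied": False},
    {"city": "C", "buyer_id": 5, "owner_occupied": True},
    {"city": "D", "buyer_id": 6, "owner_occupied": False},
    {"city": "D", "buyer_id": 7, "owner_occupied": False},
    {"city": "D", "buyer_id": 8, "owner_occupied": False},
]

counts = speculators_per_city(transactions)
high = high_speculation_cities(counts)
print(counts, sorted(high))
```

A buyer with several non-owner-occupied purchases is counted once, matching the "one or more homes" wording. Whether the count should be scaled by market size is not stated in the text, so the sketch uses raw counts.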
As with the test across minority homebuyers, if there are no cross-sectional differences across cities, $$\beta_{High}$$ should not be significantly different from zero. If $$\beta_{High}$$ is significantly greater than zero, then this would indicate that the impact of sentiment is greater in cities with more speculative investors, consistent with a theory of sentiment.

Column 2 of Table 6 reports the results of estimating Equation (4) based on the cross-sectional differences in speculators. The results show that sentiment still has a positive and significant predictive effect on prices in cities with fewer speculators. Coefficient estimates for the $$Low$$ speculators group again show that a 1% appreciation in accumulated lagged sentiment predicts a future quarterly price appreciation of approximately 0.6 percentage points, similar to that of the low minority group. Coefficient estimates for $$\beta_{High}$$, however, suggest that a 1% increase in sentiment would lead to an additional 1.9 percentage point increase in future price growth. The magnitude of this estimate reveals a greater difference between the low and high groups than the estimate for the high minority group reported in Column 1 (1.848 versus 1.315). Cities with a greater number of speculators observe a sentiment impact more than double that in those with fewer speculative investors. Thus, as with the share of minorities, estimates again reveal results consistent with a theory of sentiment.

3.2.3 Subprime loans

While the trend in subprime lending does rise and fall with prices in some cities, I find that its pattern does not drive out the impact of sentiment on house prices. This result is consistent with studies such as Glaeser, Gottlieb, and Gyourko (2010), which find that while leverage expanded significantly, credit variables are unable to explain the substantial rise in house prices.
Ferreira and Gyourko (2012) suggest that while their results also cannot explain the timing of the housing boom with patterns of subprime lending, subprime lending may still have played an important role in the later perpetuation of the housing boom. One possibility is that subprime lending served as a factor in combination with changing sentiment during this period. Mian and Sufi (2009), for example, find evidence that the increasing supply of credit from lenders led to the expansion of leverage taken out by subprime borrowers. If subprime lenders did indeed target the profile of certain borrowers, they likely targeted those buyers that were most subject to sentiment. We would simultaneously expect those most uninformed and vulnerable to sentiment to be more willing to take out risky, subprime loans. If this is the case, then this suggests an additional cross-sectional test: we should observe a larger impact of sentiment on prices in markets where more subprime loans were taken out.

I divide markets into equal $$Low$$ and $$High$$ groups, now based on the average number of subprime loans made each quarter. Following Ferreira and Gyourko (2012), I consider a loan subprime if it was issued by any of the top twenty subprime lenders ranked by the publication Inside Mortgage Finance. I again estimate Equation (4), but now test for cross-sectional differences based on subprime lending across markets. Similar to the two previous tests, this estimation tests the alternative hypothesis $$\beta_{High}>0$$, which would indicate that sentiment effects on prices are greater in cities with greater subprime lending. Results in support of the null hypothesis, $$\beta_{High}=0$$, would instead indicate that there is no discernible cross-sectional difference across markets based on the frequency of subprime lending. Column 3 of Table 6 reports the cross-sectional results based on the differences across markets in subprime loans.
Baseline estimates for sentiment in the $$Low$$ group are still positive (0.850, with a $$t$$-statistic of 1.297) and just slightly lower in magnitude than the baseline estimates of sentiment in Table 2. Estimates for $$\beta_{High}$$ in Column 3, however, indicate that the effect of sentiment is even greater in magnitude across markets with more subprime lending. A 1% increase in sentiment has an additional 1.4 percentage point impact on price appreciation in cities with high subprime lending. As in Column 2, these estimates show a distinct difference between the $$Low$$ and $$High$$ groups based on subprime loans. Consistent with a sentiment hypothesis, cities with greater subprime lending experience a greater effect of sentiment on house price changes.

3.2.4 Price of homes

The tests in this section thus far have taken advantage of the cross-sectional differences in housing demand across cities to test predicted differences in sentiment. Nonetheless, since there exist different types of buyers within markets as well, we can also exploit cross-sectional differences in informed demand within a city. If the above results are evidence of an effect of sentiment, then varying levels of informed agents within local markets should imply varying impacts of sentiment as well. One way to identify different types of agents within a city is through the separate markets they participate in. The most obvious and straightforward way to distinguish these markets is through price. Buyers with lower incomes would only qualify for lower-priced homes, while higher-income buyers would be more active in higher-priced homes. Minority households are often in the lower quartiles of the income distribution, and Lusardi and Mitchell (2007) find that lower-income respondents similarly report lower financial literacy levels.
Thus, as in the other tests in this section, a theory of sentiment would suggest that prices in markets dominated by buyers with less financial education should be more greatly affected by sentiment than those among a higher income profile.

The FHFA house prices used in the estimations thus far only provide an index of the average change in all house price levels for a metropolitan area. The Case-Shiller home price indices, however, are also divided into three price tiers: low, medium, and high. The tiers are calculated to be comparable across metro areas, so that the low tier reflects the bottom third of sale prices while the high tier reflects sales in the top third of home prices. Thus, in Columns 4 and 5 of Table 6, I re-estimate the main specification in Equation (2) but replace overall prices with low-tier and then high-tier prices across metropolitan areas, respectively. Note that since the Case-Shiller home price index only tracks house prices for 20 metropolitan areas, estimations in Columns 4 and 5 are limited to this sample.

Columns 4 and 5 show that estimated effects of sentiment are positive and significant in magnitude relative to the average quarterly price growth in both low and high-priced homes. Column 4 reveals, however, that the total estimated effect of sentiment on low-tier homes is much higher than on high-priced homes across cities. Estimates are not only statistically significant but very large in magnitude, reporting a response of nearly 1.4 percentage points in quarterly price growth to past changes in sentiment. In contrast, Column 5 shows that estimates for high-priced homes are smaller, at a little over half the magnitude of those for lower-priced homes ($$\beta=0.81$$). Therefore, these estimates confirm that results are not only consistent with predictive effects of sentiment cross-sectionally across cities, but among market segments within cities as well.

4. Conclusion

While there has been much discussion and interest in the role of “animal spirits” in the most recent housing crisis, empirical tests of this hypothesis have been limited due to the lack of sentiment measures for the housing market. Any measure of expectations is naturally difficult to construct, and survey measures are expensive to implement and therefore limited in geographic scope and/or frequency. Housing markets are driven by local factors, however, and understanding why some markets experienced big price movements and others did not in the last housing cycle therefore requires variables with cross-sectional variation. This paper contributes the first real-time measures of local housing sentiment across 34 major metropolitan markets by quantifying the tone of local housing news. Specifically, I capture the share of positive minus negative words across local housing newspaper articles.

I find that patterns in my measure of media housing sentiment appear to lead movements in house prices. In cities with big booms and busts of house prices, I find that media sentiment peaks in approximately 2004 while house prices peak in 2006. This leading pattern is also reflected in the Michigan Survey of Consumers (national) measure of housing confidence, which peaks slightly ahead of my composite-20 media housing sentiment index in 2004. I am also able to validate four of my local city sentiment measures against surveys of housing expectations from Case, Shiller, and Thompson (2012). Though only available annually, the Case, Shiller, and Thompson (2012) measures exhibit similar leading patterns across cities and correlate highly with my media sentiment indices. In some ways, this pattern seems to contradict the perception that buyers were positive up through 2006 before prices began to fall. Articles in my media sentiment index are still positive through 2006, however, but reach their positive peak in 2004.
Similarly, in both the Michigan SOC and Case, Shiller, and Thompson (2012), respondents are still very positive up to 2006, but their expectations are at their highest in 2004.

I find that changes in my measure of media sentiment have significant predictive power for future house price growth. The media housing sentiment index explains a significant amount of variation in house price changes above and beyond a set of observed economic factors that have been shown to predict house prices historically. These traditional factors appear to explain more variation in cities with more stable house price appreciation, while media sentiment accounts for more of the variation in cities with large swings in house prices. This effect remains robust to the inclusion of additional controls for subprime lending and the availability of easy credit. The media sentiment index itself reflects a backward-looking structure consistent with the extrapolative expectations proposed in behavioral finance theories.

These results are consistent with two interpretations of the housing media sentiment index. The housing media index could proxy for investor sentiment in the housing market, or it could proxy for difficult-to-measure fundamentals that are instead driving housing prices. Note that regardless of interpretation, the housing media sentiment index provides a useful methodology to measure unobservable factors in the housing market. Nonetheless, the effect of the media sentiment index on house prices does not appear to be driven by news stories that discuss housing fundamentals. Media sentiment also has a greater effect on house prices among those markets with a larger minority homebuyer presence, more speculative investors, and across lower priced homes. The predictive effect is also amplified in markets where more subprime loans were approved and taken out.
I interpret these results as supportive of a sentiment interpretation of the housing market, and less consistent with an informational interpretation of the media index. A story of information would instead have to account for why certain unobserved fundamentals would have greater effects in markets with more homebuyers vulnerable to sentiment than in others. The amplified effects of sentiment and subprime lending provide a potential explanation for why prior studies have been unable to account for the magnitude of house price changes with the share of minority buyers or expansion of easy credit alone.

Without a strictly exogenous instrument for sentiment, this paper makes a careful effort to avoid drawing any conclusions about causality. Causes of the most recent housing cycle most likely cannot be summarized by one single factor, and the cross-sectional analysis of this paper suggests that the driving factors behind the last boom are more complicated. Expectations and fundamentals likely have a more complex relationship; for example, a subset of homebuyers may systematically overreact to a positive shock from lower interest rates or increases in credit supply. These results strongly suggest that sentiment should be taken seriously as a potential determinant of house prices and deserves greater attention in future research and policy concerns. In particular, the results of this paper suggest future work might address a greater understanding of what specific factors drive sentiment, whether the media plays a role in perpetuating financial mistakes, and whether these factors can be improved with current financial education and literacy policies.

I thank Fernando Ferreira, Joe Gyourko, Olivia Mitchell, Michael Roberts, Todd Sinai, and two anonymous referees for their valuable comments and feedback.
I also thank the seminar participants at Wharton; UC-Berkeley; George Mason University; Harvard Business School; HEC Paris; University of Illinois at Urbana-Champaign; Miami University; University of Michigan; Michigan State; New York University; Washington University at St. Louis; University of Rochester Simon; Federal Reserve Banks of Boston, Chicago, and Philadelphia; the University of California-Davis Annual Symposium, the NBER Behavioral Finance Workshop, the NBER Real Estate Summer Institute, the AEA/ASSA Meetings; and the Whitebox Advisors Graduate Student Conference for their helpful suggestions and comments. I am grateful to Dataquick for providing data for this project during my graduate studies at Wharton. Supplementary data can be found on The Review of Financial Studies Web site. Footnotes 1 These lists can be found at http://www.wjh.harvard.edu/inquirer/Increas.html and http://www.wjh.harvard.edu/inquirer/rise.html. My dictionary source for synonyms is Roget 21st Century Thesaurus, 3rd Edition. 2 Full word lists are available upon request. 3 This calculation essentially treats all articles in one period as one long document; an alternative method is to calculate the share of positive and negative words in each individual article and then average across articles. I try both methods and do not find a difference in values. 4 Loughran and McDonald (2011) apply the same strategy except with a preceding word distance of three words. Textual analysis studies in the computer science field use a preceding distance of five words, so I opt for the wider window. 5 Details on alternative versions and their correlations with the baseline index are available on request. 6 The data for these proxies come from DataQuick, a proprietary transaction deeds records database. 7 See http://www.elliemae.com/. 8 I find sentiment has predictive power for prices up to $$k=8$$, but limit the lag structure to four quarters to conserve degrees of freedom. 
The Online Appendix provides the results with an eight-quarter lag structure. 9 For full point estimates, please refer to the Online Appendix. 10 Specifically, I control for loan-to-value ratios, the fraction of subprime loans following Ferreira, Gyourko, and Tracy (2010), and the 6-month London Interbank Offered Rate (LIBOR). Please see the Online Appendix for full regression results. 11 Similar strategies are used in to measure speculators in Ferreira and Gyourko (2012), Chinco and Mayer (2016), and Bayer, Gelssler, and Roberts (2011). References Agarwal, S., Duchin, R. Evanoff, D. and Sosyura. D. 2012 . In the mood for a loan: The causal effect of sentiment on credit origination. Working Paper . Akerlof, G., and Shiller. R. J. 2009 . Animal spirits: How human psychology drives the economy and why it matters for global capitalism . Princeton, NJ : Princeton University Press . Altonji, J. G., Elder, T. E. and Taber. C. R. 2005 . Selection on observed and unobserved variables: Assessing the effectiveness of catholic schools. Journal of Political Economy 113 : 151 – 84 . Google Scholar Crossref Search ADS Angriest, J. D., and Krueger. A. B. 1999 . Empirical strategies in labor economics. In Handbook of labor economics , 1277 – 366 . Eds. Card D. and Ashenfelter. O. Amsterdam : Elsevier . Antweiler, W., and Murray. F. Z. 2004 . Is all that talk just noise? The information content of internet stock message boards. Journal of Finance 59 : 1259 – 94 . Google Scholar Crossref Search ADS Baker, M., and Wurgler. J. Investor sentiment and the cross-section of stock returns. Journal of Finance 61 : 1645 – 80 . Crossref Search ADS Baker, M., and Wurgler. J. Investor sentiment in the stock market. Journal of Economic Perspectives 21 : 129 - 51 . Crossref Search ADS Barberis, N., Shleifer, A. and Vishny. R. A model of investor sentiment. Journal of Financial Economics 49 : 307 – 43 . Crossref Search ADS Bayer, P., Gelssler, C. and Roberts. J. W. 
Speculators and middlemen: The role of flippers in the housing market. Working Paper, National Bureau of Economic Research.

Campbell, J. Y., and Kyle, A. S. 1993. Smart money, noise trading and stock price behaviour. Review of Economic Studies 60:1–34.

Campbell, J. Y., Giglio, S., and Pathak, P. 2011. Forced sales and house prices. American Economic Review 101:2108–31.

Case, K. E., and Shiller, R. J. 1988. The behavior of home buyers in boom and post-boom markets. New England Economic Review November/December:29–46.

Case, K. E., and Shiller, R. J. 1989. The efficiency of the market for single-family homes. American Economic Review 79:125–37.

Case, K. E., and Shiller, R. J. 2003. Is there a bubble in the housing market? Brookings Papers on Economic Activity.

Case, K. E., Shiller, R. J., and Thompson, A. 2012. What have they been thinking? Home buyer behavior in hot and cold markets. Brookings Papers on Economic Activity.

Chinco, A., and Mayer, C. 2016. Misinformed speculators and mispricing in the housing market. Review of Financial Studies 29:486–522.

De Long, J. B., Shleifer, A., Summers, L. H., and Waldmann, R. J. 1990a. Noise trader risk in financial markets. Journal of Political Economy 98:703–38.

De Long, J. B., Shleifer, A., Summers, L. H., and Waldmann, R. J. 1990b. Positive feedback investment strategies and destabilizing rational speculation. Journal of Finance 45:379–95.

Demyanyk, Y., and Van Hemert, O. 2011. Understanding the subprime mortgage crisis. Review of Financial Studies 24:1848–80.

Dougal, C., Engelberg, J., Garcia, D., and Parsons, C. A. 2012. Journalists and the stock market. Review of Financial Studies 25:639–79.

Driscoll, J. C., and Kraay, A. C. 1998.
Consistent covariance matrix estimation with spatially dependent panel data. Review of Economics and Statistics 80:549–60.

Engelberg, J. 2008. Costly information processing: Evidence from earnings announcements. Working Paper.

Feldman, R., Joshua, G., and Segal, B. 2008. The incremental information content of tone change in management discussion and analysis. Working Paper.

Ferreira, F., and Gyourko, J. 2011. Anatomy of the beginning of the housing boom: U.S. neighborhoods and metropolitan areas. Working Paper, National Bureau of Economic Research.

Ferreira, F., and Gyourko, J. 2012. Heterogeneity in neighborhood-level price growth in the United States, 1993-2009. American Economic Review 102:134–40.

Ferreira, F., Gyourko, J., and Tracy, J. 2010. Housing busts and household mobility. Journal of Urban Economics 68:34–45.

Foote, C. 2007. Space and time in macroeconomic panel data: Young workers and state-level unemployment revisited. Working Paper, Federal Reserve Bank of Boston.

Foote, C. L., Gerardi, K., and Willen, P. S. 2008. Negative equity and foreclosure: Theory and evidence. Journal of Urban Economics 64:234–45.

Galbraith, J. 1990. A short history of financial euphoria. New York: Viking Press.

Gentzkow, M., and Shapiro, J. M. 2010. What drives media slant? Evidence from U.S. daily newspapers. Econometrica 78:35–71.

Gerardi, K., Lehnert, A., Sherlund, S., and Willen, P. 2008. Making sense of the subprime crisis. Brookings Papers on Economic Activity.

Glaeser, E., Gottlieb, J., and Gyourko, J. 2010. Can cheap credit explain the housing boom? Working Paper, National Bureau of Economic Research.

Goetzmann, W., Peng, L., and Yen, J. 2012. The subprime crisis and house price appreciation. Journal of Real Estate Finance and Economics 44:36–66.

Greenwood, R., and Shleifer, A. 2014. Expectations of returns and expected returns. Review of Financial Studies 27:714–46.

Hanley, E., and Hoberg, G. 2010. The information content of IPO prospectuses. Review of Financial Studies 23:2821–64.

Himmelberg, C., Mayer, C., and Sinai, T. 2005. Assessing high house prices: Bubbles, fundamentals, and misperceptions. Journal of Economic Perspectives 19:67–92.

Jegadeesh, N., and Wu, D. 2013. Word power: A new approach for content analysis. Journal of Financial Economics 110:712–29.

Keys, B. J., Seru, A., and Vig, V. 2012. Lender screening and the role of securitization: Evidence from subprime loans. Review of Financial Studies 25:2071–108.

Keynes, J. M. 1936. The general theory of employment, interest and money. London: Macmillan.

Kindleberger, C. 1936. Manias, panics, and crashes: A history of financial crises. Hoboken, NJ: John Wiley and Sons.

Lai, R. N., and Van Order, R. A. 2010. Momentum and house price growth in the United States: Anatomy of a bubble. Real Estate Economics 38:753–73.

Li, F. 2006. Do stock market investors understand the risk sentiment of corporate annual reports? Working Paper.

Loughran, T., and McDonald, B. 2011. When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks. Journal of Finance 66:35–65.

Lusardi, A., and Mitchell, O. 2007. Baby boomer retirement security: The roles of planning, financial literacy, and housing wealth. Journal of Monetary Economics 54:205–24.

Mayer, C., and Sinai, T. 2009. U.S. house price dynamics and behavioral finance. In Policy making insights from behavioral economics, chapter 5. Eds. Foote, C., Goette, L.,
and Meier, S. Boston: Federal Reserve Bank of Boston.

Mian, A., and Sufi, A. 2009. The consequences of mortgage credit expansion: Evidence from the U.S. mortgage default crisis. Quarterly Journal of Economics 124:1449–96.

Mian, A., and Sufi, A. 2011. House prices, home equity-based borrowing and the U.S. household leverage crisis. American Economic Review 101:2132–56.

Mullainathan, S., and Shleifer, A. 2005. The market for news. American Economic Review 95:1031–53.

Nakajima, M. 2005. Rising earnings instability, portfolio choice, and housing prices. Working Paper.

Nakajima, M. 2009. Understanding house price dynamics. Business Review Q2:20–28.

Newey, W., and West, K. 1987. A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica 55:703–8.

Piazzesi, M., and Schneider, M. 2009. Momentum traders in the housing market: Survey evidence and a search model. American Economic Review 99:406–11.

Roback, J. 1982. Wages, rents, and quality of life. Journal of Political Economy 90:1257–78.

Rosen, S. 1979. Wage-based indexes of urban quality of life. In Current issues in urban economics. Baltimore: Johns Hopkins University Press.

Shiller, R. J. 2005. Irrational exuberance. Princeton, NJ: Princeton University Press.

Shiller, R. J. 2008. The subprime solution. Princeton, NJ: Princeton University Press.

Shiller, R. J. 2009. Animal spirits. Princeton, NJ: Princeton University Press.

Tetlock, P. 2007. Giving content to investor sentiment: The role of media in the stock market. Journal of Finance 62:1139–68.

Tetlock, P., Saar-Tsechansky, M., and Macskassy, S. 2008. More than words: Quantifying language to measure firms’ fundamentals.
Journal of Finance 63:1437–67.

Tracy, J., Schneider, H., and Chan, S. 1999. Are stocks overtaking real estate in household portfolios? Current Issues in Economics and Finance 5:1–6.

© The Author(s) 2018. Published by Oxford University Press on behalf of The Society for Financial Studies. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)

### Journal

The Review of Financial Studies, Oxford University Press

Published: Oct 1, 2018
