Equity returns and sentiment

1 Introduction

Quantifying subjective opinion and using it as a predictor of stock market returns and prices has become an important topic of research and empirical analysis in academia and industry. According to the efficient-market hypothesis (EMH) developed by Fama (see Fama et al. [11], Fama [12], and the references therein), information, and thus news, drives equity prices, as an “asset price reflects all available information.”

This paper asks whether public sentiment provides valuable information that affects stock prices, and it quantifies the significance of the effects of public opinion and sentiment on equity prices. These questions are motivated by research in behavioral economics and behavioral finance, which suggests that asset prices could be affected by human psychological or behavioral factors (Bollen et al. [4]). For instance, according to Gruhl et al. [17], Liu et al. [24], and Mishne and de Rijke [27], sales of books, movies, and other products can be predicted by sentiment in social media such as blogs and Twitter posts. Hence, it is reasonable to assume that public feeling could affect the returns and prices of financial assets and indices. This is consistent with research in psychology, which indicates that emotion plays a significant role in human decision-making (Damasio [9]). Studies in behavioral finance and related fields also indicate that emotion and sentiment contribute meaningfully to financial decisions and investors’ performance [25,28].
Moreover, research indicates that downward pressure on market prices is related to high media pessimism, as measured from publications in the Wall Street Journal [33].

To study the effect of public mood on stock prices, we need a reliable, representative, and accessible proxy that can be used to construct time series of sentiment measures. Large-scale surveys of public mood are impractical: they are costly in resources and have great difficulty producing time-sensitive data. On the other hand, Twitter, a popular social media website launched in 2006, has millions of posts per day. The average number of monthly active Twitter users is over 330 million. Twitter users come from a variety of backgrounds and include CEOs and analysts as well as the general public, which makes up the majority of users. Therefore, it is reasonable to choose the sentiment of Twitter posts as a proxy for the public mood [4]. Many papers in the literature have focused on the relation between Twitter sentiment and stock market returns (see, among others, Behrendt and Schmidt [3], Corea [8], Groß-Klußmann et al. [16], Mittal and Goel [26], Ranco et al. [30], and Washha et al. [34]).

In the first stage, this paper investigates the relationship between prices and public mood for three major indices, the S&P 500, NASDAQ, and FTSE 100, and one corporate stock, Apple Inc., using a large database of Twitter posts collected with the tickers of the indices and the stock as keywords.
In particular, the database includes approximately 8,029,963 posts for each keyword (3,000 randomly picked Tweets per day) from January 1, 2008 to April 1, 2016. The period considered thus does not include the beginning of the ongoing COVID-19 pandemic; the analysis of structural breaks due to this and other shocks is beyond the scope of this paper and is left for future research. We use Granger causality tests to investigate whether a change in public sentiment can cause a change in stock prices.

In the second stage, we explore models that may be able to further explain the relationship between Twitter sentiment and the returns and prices on the S&P 500 and FTSE 100 indices. We estimate Nonlinear Generalized AutoRegressive Conditional Heteroskedasticity (NGARCH) time series models to explain and quantify the relationship and to model the effects of Twitter sentiment on volatility clustering in financial markets.

In the third stage, we focus on the analysis of the effects of Twitter sentiment on market volatility using the fitted GARCH models and Granger causality tests.

2 Data acquisition

This section discusses where and how the stock price and Twitter data were acquired. It also describes the methods for quantifying the sentiment in the text of Twitter posts and for generating the sentiment measures used in the analysis, tests, and model development.

The data on stock prices for the assets considered are downloaded from Yahoo Finance.

For the sentiment data, an R program was developed that uses the Twitter application programming interface (API) to download a fixed number of Twitter posts (tweets) containing a particular keyword for each day in the time interval considered. In particular, we used the stable CRAN version of the twitteR library [14], which is built on the standard Twitter API (Twitter application access API at http://dev.twitter.com/).
We used the “searchTwitter” function in this package to obtain data for a given date, keywords, maximum number of returned tweets (limited by the API capability at the time), and other search strings. The program analyzed a total of 100 GB of Twitter data. Following Tetlock [33], to quantify sentiment, this paper uses various sentiment measures based on the General Inquirer categories in the Harvard Psychological Dictionary [13].

The general analysis procedure employed in the paper is depicted in the flow graph (Figure 1).

Figure 1: The analysis procedure.

2.1 Twitter mining algorithm

This section describes the algorithm that was used to convert Twitter text data into time series of different sentiment measures.

First, for each day in the period from January 1, 2008 to January 4, 2016, a random sample of 3,000 Tweets was extracted. All Tweets from the same day were combined into one large text file that was used as a proxy for public comments on Twitter. For each daily text file, all punctuation and other symbols (e.g., “https://”) were removed to form a crude corpus. The crude corpus was then filtered further to remove words that carry no meaning for the purpose of sentiment quantification, such as “is,” “this,” etc., to form the final daily corpus.

Second, each daily corpus was checked against the Harvard Psychological Dictionary’s General Inquirer categories [13], using the four broad classes Positive, Negative, Active, and Passive and their subclasses Affiliation, Hostile, Strong, and Weak, by counting the frequency of words in the corpus that fall into each category. Hence, for each day, eight word-frequency values, one per category, were obtained from the collected Tweets. The process was repeated for every day in the time interval considered.
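The cleaning and counting steps above can be illustrated with a minimal Python sketch. The paper's own program was written in R and used the full General Inquirer word lists; the tiny word lists, the stop-word set, the example tweets, and the function name `daily_counts` below are all hypothetical stand-ins chosen only to make the example self-contained.

```python
import re
from collections import Counter

# Hypothetical toy word lists standing in for the General Inquirer categories.
CATEGORIES = {
    "Positive": {"good", "gain", "rally"},
    "Negative": {"drop", "unable", "loss"},
    "Active": {"active", "buy", "rise"},
    "Passive": {"hold", "wait"},
    "Affiliation": {"together", "partner"},
    "Hostile": {"attack", "fight"},
    "Strong": {"strong", "surge"},
    "Weak": {"weak", "unable"},
}
# Hypothetical stop-word list ("meaningless" words for sentiment purposes).
STOPWORDS = {"is", "this", "the", "a", "to", "of", "and", "in"}


def daily_counts(tweets):
    """Return the eight category word counts for one day's tweets."""
    text = " ".join(tweets).lower()
    text = re.sub(r"https?://\S+", " ", text)   # drop URLs
    text = re.sub(r"[^a-z\s]", " ", text)       # drop punctuation and symbols
    tokens = [w for w in text.split() if w not in STOPWORDS]
    counts = Counter()
    for word in tokens:
        for category, words in CATEGORIES.items():
            if word in words:                   # a word may sit in several categories
                counts[category] += 1
    return {category: counts[category] for category in CATEGORIES}


tweets = [
    "Apple shares drop, unable to hold the rally https://t.co/x",
    "This is a strong and active market!",
]
print(daily_counts(tweets))
```

Running the function once per day and stacking the results gives eight raw daily count series of the kind the text describes.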
Following this procedure, time series of the raw sentiment data from Twitter were generated.

The example in Appendix A1 demonstrates how the algorithm works for five randomly acquired Tweets on a particular date. The Tweets considered in Appendix A1 are not sentimentally neutral and contain polarized words such as “drop,” “Active,” “unable,” etc. that indicate their sentiment orientation to some extent. This observation also provides a logical justification for using the categorization method in the paper.

Third, to obtain time series for testing Granger causality between Twitter sentiment and the stock price, several different sentiment measures are used in the paper. The sentiment measures considered are inspired by the analysis of Zhang and Skiena [35] and are summarized in the following formulas, where #Positive, #Negative, etc. stand for the number of words in the positive, negative, and other corresponding sentiment categories. The polarity sentiment measure is defined as follows:

\[ \text{Polarity} = \frac{\#\text{Positive} - \#\text{Negative}}{\#\text{Positive} + \#\text{Negative}}. \tag{2.1} \]

Obviously, unlike asset prices, the Polarity measure is not guaranteed to be positive, as the number of positive words in the Twitter posts considered is not necessarily larger than the number of negative words.
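As a small illustration of why Polarity can be negative, the sketch below computes (2.1) from two word counts and compares it with the share of positive words among all positive and negative words. The function names and the example counts are hypothetical; the only fact used is the algebraic identity share = (1 + Polarity) / 2, which follows directly from the two definitions.

```python
def polarity(n_positive, n_negative):
    """Polarity as in Eq. (2.1); lies in [-1, 1]."""
    total = n_positive + n_negative
    if total == 0:
        raise ValueError("no positive or negative words counted")
    return (n_positive - n_negative) / total


def positive_share(n_positive, n_negative):
    """Share of positive words among positive and negative words; lies in [0, 1]."""
    total = n_positive + n_negative
    if total == 0:
        raise ValueError("no positive or negative words counted")
    return n_positive / total


# Hypothetical day with 1 positive and 3 negative words.
p = polarity(1, 3)          # negative, since negative words dominate
s = positive_share(1, 3)    # always nonnegative
print(p, s, (1 + p) / 2)
```

Because the share is an increasing affine transformation of Polarity, moving from one to the other changes the scale but not the ordering of days by sentiment.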
To ensure positivity of the sentiment measures considered, without loss of generality, Zhang and Skiena’s Polarity measure is modified into the “Relative positive” measure, defined as follows:

\[ \text{Relative\_positive} = \frac{\#\text{Positive}}{\#\text{Positive} + \#\text{Negative}}. \tag{2.2} \]

In a similar way, we define the Relative Active, Relative Affiliation, and Relative Strong measures:

\[ \text{Relative\_Active} = \frac{\#\text{Active}}{\#\text{Active} + \#\text{Passive}}. \tag{2.3} \]

The categories Affiliation and Strong are subclasses of the Positive and Active categories, respectively. Their relative sentiment measures are defined as follows:

\[ \text{Relative\_Affiliation} = \frac{\#\text{Affiliation}}{\#\text{Positive} + \#\text{Negative}}, \tag{2.4} \]

\[ \text{Relative\_Strong} = \frac{\#\text{Strong}}{\#\text{Active} + \#\text{Passive}}. \tag{2.5} \]

3 Granger causality analysis

Once the sentiment data had been acquired, Granger causality tests were performed to investigate whether there is causality between Twitter sentiment and the stock price, which also gives a partial answer to the question of the relationship between these variables. The Granger causality tests are applied to time series confirmed to be stationary I(0) processes.

3.1 (Non-)stationarity analysis

First, we conduct augmented Dickey-Fuller (ADF) tests to investigate the presence of a unit root in the processes considered.
Testing for a unit root in a time series $X_t$ is based on the ordinary least squares (OLS) regression

\[ \Delta X_t = \beta_0 + \alpha t + \delta X_{t-1} + \gamma_1 \Delta X_{t-1} + \ldots + \gamma_p \Delta X_{t-p} + \varepsilon_t, \]

where $\beta_0$ is a constant, $\alpha$ is a time trend coefficient, and $\varepsilon_t$ is the innovation process with zero mean. Under $H_0: \delta = 0$, the process has a unit root and is nonstationary, while $H_a: \delta < 0$ corresponds to stationarity of $X_t$ (Dickey and Fuller [10], Ch. 17 in Hamilton [19], and Section 15.7 in Stock and Watson [31]).

The results of the ADF tests indicate that all the processes considered, including all the time series of sentiment measures and the logarithms of stock prices, are unit root I(1) processes. The analysis and tests of Granger causality in the paper are therefore based on the stationary I(0) first differences of these processes. In other words, in the following section, we focus on testing Granger causality between the changes in the log price and the changes in the Twitter sentiment measures considered.

3.2 Autoregressive distributed lag models in Granger causality tests

A process $X_t$ is said to Granger cause a process $Y_t$ if the lags of $X_t$ have useful predictive content for forecasting $Y_t$ above and beyond the other regressors, e.g., the lags of $Y_t$ itself, in the model (see, among others, Section 15.4 in [31]). The Granger causality test is usually carried out using autoregressive distributed lag (ADL) models

\[ Y_t = \beta_0 + \sum_{j=1}^{J} \beta_j Y_{t-j} + \sum_{k=1}^{K} \gamma_k X_{t-k} + u_t. \tag{3.1} \]

The Granger causality test is carried out using an $F$-test on all the coefficients on the lags of $X_t$.
The null hypothesis in this test is $H_0: \forall k \in \{1, \ldots, K\},\ \gamma_k = 0$, which means that $X_t$ is not a useful predictor of $Y_t$ given the lags of the latter process. The alternative hypothesis is $H_a: \exists k \in \{1, \ldots, K\},\ \gamma_k \ne 0$, which corresponds to the lags of $X_t$ having some useful predictive content for forecasting $Y_t$ beyond the lags of $Y_t$ itself.

Let $P_t$ denote the logarithm of the price of the stock or index considered, and let $\mathrm{Sem}_t$ denote a sentiment measure. The test of whether Twitter sentiment Granger causes the stock price is based on the following model:

\[ \Delta P_t = \beta_0 + \sum_{j=1}^{J} \beta_j \Delta P_{t-j} + \sum_{k=1}^{K} \gamma_k \Delta \mathrm{Sem}_{t-k} + u_t. \tag{3.2} \]

Similarly, the test of whether the stock price Granger causes Twitter sentiment is based on the model

\[ \Delta \mathrm{Sem}_t = \beta_0 + \sum_{m=1}^{M} \beta_m \Delta \mathrm{Sem}_{t-m} + \sum_{n=1}^{N} \gamma_n \Delta P_{t-n} + \varepsilon_t. \tag{3.3} \]

3.3 Determination of the number of lags using the BIC criterion

As shown in Eqs. (3.1)–(3.3), implementation of Granger causality tests requires determining the numbers of lags $J, K, M, N$ for both of the processes considered. We use the Bayesian information criterion (BIC) to determine the numbers of lags of the processes $X_t$ and $Y_t$ ($\Delta P_t$ and $\Delta \mathrm{Sem}_t$) in the ADL models.
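The test based on model (3.1) with BIC-selected lags can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the helper names (`ols_rss`, `design`, `bic`, `select_lags`, `granger_f`) are hypothetical, the F-statistic is the classical homoskedasticity-only version, the BIC is computed as n·ln(RSS/n) + (number of parameters)·ln(n) on a common sample, and the data are simulated so that lagged x predicts y by construction. It is not the paper's own code.

```python
import math
import random


def ols_rss(X, y):
    """Residual sum of squares from an OLS fit of y on the columns of X."""
    n, k = len(X), len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(X[t][i] * X[t][j] for t in range(n)) for j in range(k)]
         + [sum(X[t][i] * y[t] for t in range(n))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((y[t] - sum(b * v for b, v in zip(beta, X[t]))) ** 2
               for t in range(n))


def design(y, x, J, K, start):
    """Rows [1, y_{t-1..t-J}, x_{t-1..t-K}] for t = start, ..., len(y) - 1."""
    return [[1.0] + [y[t - j] for j in range(1, J + 1)]
            + [x[t - k] for k in range(1, K + 1)]
            for t in range(start, len(y))]


def bic(rss, n, n_params):
    return n * math.log(rss / n) + n_params * math.log(n)


def select_lags(y, x, max_lag):
    """Pick J from AR models for y, then K for lagged x given J, both by BIC."""
    target, n = y[max_lag:], len(y) - max_lag
    J = min(range(1, max_lag + 1), key=lambda j: bic(
        ols_rss(design(y, x, j, 0, max_lag), target), n, 1 + j))
    K = min(range(1, max_lag + 1), key=lambda k: bic(
        ols_rss(design(y, x, J, k, max_lag), target), n, 1 + J + k))
    return J, K


def granger_f(y, x, J, K):
    """F-statistic for H0: all K coefficients on lagged x are zero."""
    start = max(J, K)
    target = y[start:]
    rss_r = ols_rss(design(y, x, J, 0, start), target)  # lags of y only
    rss_u = ols_rss(design(y, x, J, K, start), target)  # plus lags of x
    df = len(target) - (1 + J + K)
    return ((rss_r - rss_u) / K) / (rss_u / df)


# Simulated example: x leads y by one period, y does not lead x.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(300)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 300)]
J, K = select_lags(y, x, max_lag=4)
M, N = select_lags(x, y, max_lag=4)
f_xy = granger_f(y, x, J, K)   # tests "x does not Granger cause y"
f_yx = granger_f(x, y, M, N)   # tests "y does not Granger cause x"
print(J, K, f_xy, M, N, f_yx)
```

In practice the statistic would be compared with the critical values of the F distribution with K and n − 1 − J − K degrees of freedom, and a heteroskedasticity-robust version may be preferable for financial returns.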
More precisely, as usual, first, the number $J$ of lags in the autoregressive (AR) model for the process $Y_t$,

\[ Y_t = \beta_0 + \sum_{j=1}^{J} \beta_j Y_{t-j} + u_t, \]

is determined based on the BIC, and then the criterion is applied to determine the number of lags $K$ of the potential predictor $X_t$ in model (3.1) with the estimated number $J$.

The results of the lag length selection on the basis of the BIC for Granger causality tests are provided in Appendix A2.

3.4 The results of Granger causality tests

The results of tests of Granger causality between the considered asset returns and the corresponding Twitter sentiment are provided in Tables 1 and 2.

Table 1: p-values of the F-statistics for the hypothesis $H_0$: the changes in sentiment do not Granger cause the asset returns

| Asset | Analyzed tweets | Relative positive | Relative active | Relative affiliation | Relative strong |
| S&P500 | 9,116,537 | 0.0024253*** | 0.010023** | 0.01043** | 0.0106** |
| FTSE100 | 9,099,000 | 0.012089** | 0.022854** | 0.017246** | 0.011853** |
| NASDAQ | 8,826,775 | 0.85884 | 0.44155 | 0.83846 | 0.64976 |
| AAPL | 7,608,000 | 0.40384 | 0.86842 | 0.26875 | 0.13011 |

Notes: *** indicates significance at the 1% level and ** indicates significance at the 5% level.

Table 2: p-values of the F-statistics for the hypothesis $H_0$: the asset returns do not Granger cause the changes in sentiment

| Asset | Analyzed tweets | Relative positive | Relative active | Relative affiliation | Relative strong |
| S&P500 | 9,116,537 | 0.88878 | 0.47615 | 0.23448 | 0.94569 |
| FTSE100 | 9,099,000 | 0.92718 | 0.33062 | 0.57001 | 0.80284 |
| NASDAQ | 8,826,775 | 0.75628 | 0.66484 | 0.54052 | 0.91209 |
| AAPL | 7,608,000 | 0.86636 | 0.62385 | 0.67329 | 0.69855 |

Similar to Section 3.2, the null hypotheses in the tables are that the changes in sentiment do not Granger cause the changes in the log prices, that is, the returns, and vice versa.

The results in Table 2 indicate that, somewhat
surprisingly, the changes in (log) prices apparently do not Granger cause the Twitter sentiment. This conclusion is in contrast to the conventional belief that changes in asset prices affect public sentiment.

Further, the results in Tables 1 and 2 point to the conclusion that the change in Twitter sentiment related to the S&P 500 and FTSE 100 indices Granger causes their price changes but not vice versa. In particular, according to Table 1, among all the sentiment measures and the assets considered, the effect of the Relative Positive measure on S&P 500 returns appears to be the most significant, with the test statistic significant at the 1% level. In contrast, the returns on the NASDAQ index and the Apple stock appear not to be Granger caused by the respective Twitter sentiment.

3.5 S&P 500: Granger causality tests using the big data

To further evaluate and confirm the results on Granger causality in the previous section, we conduct the tests of Granger causality between the S&P 500 returns and the respective sentiment measures using a very large-scale database.

In contrast to the first-stage analysis (3,000 randomly selected tweets with the target keywords, obtained by limiting the maximum number of returned tweets in the twitteR searchTwitter function), in this section we do not limit the maximum number of returned tweets in searchTwitter and instead request the maximum number that the Twitter application access API can provide in one request. The tickers of the indices and the stock are used as search keywords. The twitteR library is based on the Twitter application access API, which limits not only the maximum number of tweets per request but also the number of requests from a given IP address within a time period. To acquire a large data set for analysis, we registered multiple accounts (hence multiple tokens) and switched IP addresses each time a request was sent.
The analysis is based on all the data obtainable from the Twitter API for the index in the time period considered. The analysis is not conducted for the FTSE 100 index, as there are far fewer posts related to it than to the S&P 500. As indicated earlier, for the S&P 500, as much data as possible was extracted and analyzed for each day in the time period considered, with approximately 14,005 Tweets per day (an average computed over all data obtained with the target keywords) and 42,478,072 Tweets in total.

The general observation was that the number of tweets obtained per day for a given keyword increases over time. This is consistent with the fact that Twitter has become better known and that more people post their thoughts on Twitter over time. For example, some keywords have only 5,000 tweets per day in 2008, but the number of tweets obtained per day increases in the following years. Eventually, the daily number of tweets obtained for a specific keyword is limited by the Twitter API we used.

The results of Granger causality tests for the S&P 500 returns and the sentiment measures using the large-scale database are provided in the second rows of Tables 3 and 4.
For comparison, we also provide, in the first rows of the tables, the Granger causality test results for the S&P 500 from the previous section.

Table 3: p-values of the F-statistics for the hypothesis $H_0$: the changes in sentiment do not Granger cause the S&P 500 returns

| Data set | Analyzed tweets | Relative positive | Relative active | Relative affiliation | Relative strong |
| 1st attempt | 9,116,537 | 0.0024253*** | 0.010023** | 0.01043** | 0.0106** |
| 2nd attempt | 42,478,072 | 0.0054962*** | 0.0042141*** | 0.0036191*** | 0.0086103*** |

Notes: *** indicates significance at the 1% level and ** indicates significance at the 5% level.

Table 4: p-values of the F-statistics for the hypothesis $H_0$: the S&P 500 returns do not Granger cause the change in sentiment

| Data set | Analyzed tweets | Relative positive | Relative active | Relative affiliation | Relative strong |
| 1st attempt | 9,116,537 | 0.88878 | 0.47615 | 0.23448 | 0.94569 |
| 2nd attempt | 42,478,072 | 0.78777 | 0.10034 | 0.67303 | 0.72055 |

The results in Tables 3 and 4 using the large-scale data confirm the results of the previous section: the Twitter sentiment related to the S&P 500 index appears to Granger cause its returns and (log) price changes but not vice versa.

In conclusion, the returns and the prices of the S&P 500 and FTSE 100 indices appear to be Granger caused by public sentiment. On the other hand, according to the results in this and the previous section, the changes in the prices of the assets considered appear not to Granger cause the respective sentiment.

4 Causality modeling: ADL and GARCH models

As discussed in the previous section, Twitter sentiment appears to Granger cause the returns on the S&P 500 and FTSE 100 indices. In this section, we focus on the analysis of models for the relationship between the returns on the indices and the respective sentiment.
In particular, we evaluate the ADL models for the relationship and further fit GARCH time series to model the effects of sentiment volatility on market volatility.

4.1 Volatility clustering

We first focus on the estimation of ADL models for the relationship between the returns on the S&P 500 and FTSE 100 indices and the lags of the sentiment measures. Similar to the analysis in the previous section, and following the results in Section 3.1, the models are estimated for the stationary changes in the log prices, that is, the returns, and the stationary changes in the sentiment measures considered.

The estimated ADL models include all the sentiment measures that were shown in the previous section to Granger cause the returns on the indices. The estimated models thus have the following form:

\[ \Delta P_t = \beta_0 + \sum_{k=1}^{K} \beta_k \Delta P_{t-k} + \sum_{i=1}^{I} \zeta_i \Delta \mathrm{Pos}_{t-i} + \sum_{j=1}^{J} \gamma_j \Delta \mathrm{Act}_{t-j} + \sum_{l=1}^{L} \eta_l \Delta \mathrm{Aff}_{t-l} + \sum_{m=1}^{M} \alpha_m \Delta \mathrm{Str}_{t-m} + \varepsilon_t, \tag{4.1} \]

where $\{\Delta P_t\}$ is the time series of the changes in the logarithm of the prices of the indices, and $\{\Delta \mathrm{Pos}_t\}$, $\{\Delta \mathrm{Act}_t\}$, $\{\Delta \mathrm{Aff}_t\}$, and $\{\Delta \mathrm{Str}_t\}$ are the time series of the changes in the Relative Positive, Relative Active, Relative Affiliation, and Relative Strong sentiment measures, respectively. The form of the ADL models is motivated by the different classes and subclasses of sentiment in the Harvard Psychological Dictionary’s General Inquirer categories [13] and the related measures of Section 2.1 used in the sentiment analysis in the paper.
The numbers $I$, $J$, $L$, and $M$ of lags included in the models are those determined by the BIC criterion as discussed in Section 3.3 and Appendix A2.

The ADL models (4.1) estimated by OLS result in a poor fit for the time series of the returns on both the S&P 500 and FTSE 100 indices. Further, the plots of the residuals from the ADL regressions point to pronounced volatility clustering in the errors of the estimated linear models.

The poor fit of linear models for the returns and the presence of volatility clustering in the ADL regression errors and the returns are in accordance with the well-known stylized facts of the absence of linear dependence and the presence of nonlinear dependence in financial returns (see Cont [7]).

In the next section, we therefore focus on models that capture the volatility clustering in the ADL model errors and the returns on the indices considered.

4.2 Modeling Granger causality and volatility clustering: ADL models with NGARCH errors

To model Granger causality between the sentiment and the returns on the S&P 500 and FTSE 100 indices while accounting for volatility clustering in the returns, as usual, we employ GARCH-type time series. As is well known, GARCH-type processes can be used to capture and model most of the stylized facts of financial returns, including the absence of linear autocorrelations, the presence of volatility clustering and autocorrelations in squared returns, heavy-tailedness and conditional heavy-tailedness, and the leverage effect (see, among others, Alberg et al. [2], Christoffersen [6], and Cont [7]).
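To make the volatility-clustering property concrete, the following minimal sketch simulates errors from a first-order NGARCH recursion of the type used in this section, with variance σ²ₜ = α₀ + α₁(uₜ₋₁ − κ)² + β₁σ²ₜ₋₁ and standard normal innovations. The parameter values, the starting level of the variance, and the function names are illustrative assumptions and are not estimates from the paper.

```python
import math
import random


def simulate_ngarch(n, alpha0, alpha1, beta1, kappa, seed=0):
    """Simulate u_t = sigma_t * eps_t with an NGARCH(1,1)-type variance."""
    rng = random.Random(seed)
    sigma2 = alpha0 / (1.0 - alpha1 - beta1)  # assumed starting level
    u = 0.0
    us, sigma2s = [], []
    for _ in range(n):
        # The variance reacts asymmetrically to the sign of the last shock via kappa.
        sigma2 = alpha0 + alpha1 * (u - kappa) ** 2 + beta1 * sigma2
        u = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        us.append(u)
        sigma2s.append(sigma2)
    return us, sigma2s


def autocorr(xs, lag=1):
    """Sample autocorrelation of xs at the given lag."""
    m = sum(xs) / len(xs)
    num = sum((xs[t] - m) * (xs[t - lag] - m) for t in range(lag, len(xs)))
    den = sum((v - m) ** 2 for v in xs)
    return num / den


# Illustrative parameters satisfying alpha1 + beta1 < 1.
us, sigma2s = simulate_ngarch(2000, alpha0=0.05, alpha1=0.08, beta1=0.9, kappa=0.3)
print("lag-1 autocorrelation of u:  ", autocorr(us))
print("lag-1 autocorrelation of u^2:", autocorr([v * v for v in us]))
```

Comparing the two printed autocorrelations is a quick diagnostic of the stylized facts mentioned above: dependence is expected to show up in the squared series rather than in the levels.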
Let $\mathcal{F}_t$ denote the filtration that contains all the information up to time $t$, and let $N(0,1)$ and $t(\nu)$ denote the standard normal and (heavy-tailed) Student-$t$ distribution with $\nu$ degrees of freedom, respectively.

The following ADL models with NGARCH errors exhibiting the aforementioned stylized facts are estimated using maximum likelihood (ML):

\[ \Delta P_t = \omega_0 + \sum_{k=1}^{K} \omega_k \Delta P_{t-k} + \sum_{i=1}^{I} \zeta_i \Delta \mathrm{Pos}_{t-i} + \sum_{j=1}^{J} \gamma_j \Delta \mathrm{Act}_{t-j} + \sum_{l=1}^{L} \eta_l \Delta \mathrm{Aff}_{t-l} + \sum_{m=1}^{M} \alpha_m \Delta \mathrm{Str}_{t-m} + u_t, \tag{4.2} \]

where

\[ u_t = \varepsilon_t \sigma_t, \quad \varepsilon_t \mid \mathcal{F}_{t-1} \sim N(0,1) \ \text{or} \ t(\nu), \tag{4.3} \]

with i.i.d.
innovations $\varepsilon_t$ that have a standard normal or Student-$t$ distribution, and the volatility dynamics are given by the NGARCH model in the following form:

\[ \sigma_t^2 = \alpha_0 + \alpha_1 (u_{t-1} - \kappa_1)^2 + \beta_1 \sigma_{t-1}^2. \tag{4.4} \]

The NGARCH specification for the errors in the ADL models for the index returns accounts for the absence of linear autocorrelations, volatility clustering, heavy-tailedness, and the leverage effect in return time series.

The models impose stationarity, that is, the condition $\alpha_1 + \beta_1 < 1$, on the GARCH parameters.

The results of the ML estimation of the aforementioned models are provided in the following sections.

4.2.1 S&P 500

As shown in Table 5, unlike linear ADL models estimated by OLS, the ADL models with NGARCH errors described in the previous section provide an exceptional fit for the S&P 500 returns.

Table 5: AGARCH model for S&P500

| Error distribution | Normal N(0,1): Coefficient | Normal N(0,1): t-Prob | Student-t: Coefficient | Student-t: t-Prob |
| ΔPrice | −0.0512350 | 0.014* | −0.0228679 | 0.082* |
| ΔRel_Positive | 685.612 | 0.088* | 539.484 | <0.001*** |
| ΔRel_Active | −328.132 | 0.594 | −108.797 | <0.001*** |
| ΔRel_Affiliation | 73.6310 | 0.854 | 192.964 | <0.001*** |
| ΔRel_Strong | −234.034 | 0.361 | −188.109 | <0.001*** |
| Constant | 0.721184 | <0.001*** | 0.567255 | <0.001*** |
| α0 | 2.84304 | 0.002*** | 7.78590 | 0.006*** |
| α1 | 0.0674871 | <0.001*** | 0.0789915 | <0.001*** |
| β1 | 0.917239 | <0.001** | 0.921008 | <0.001*** |
| Asymmetry (κ1) | | | | |
| Degree of freedom (Student-t) | N/A | N/A | 2.37587 | ≪0.001*** |

Notes: *** indicates significance at the 1% level, ** at the 5% level, and * at the 10% level.

The results in Table 5 further confirm that the changes in sentiment measures are useful predictors of the changes in the index prices and returns. In particular, in the case of the ADL models with NGARCH errors and heavy-tailed Student-$t$ innovations, the lags of the changes in all the sentiment measures appear to be highly significant, with the corresponding $p$-values less than 0.001. Further, even in the case of ADL-NGARCH model errors with standard normal innovations, one of the sentiment measures, Relative Positive, exhibits statistical significance in predictive models for the S&P 500 returns.

4.2.2 FTSE 100

Similar to the S&P 500 case, the estimation results in Table 6 for predictive ADL regression models for FTSE 100 returns with NGARCH errors demonstrate statistical significance of the changes in the sentiment measures.

Table 6: AGARCH model for FTSE 100

| Error distribution | Normal N(0,1): Coefficient | Normal N(0,1): t-Prob | Student-t: Coefficient | Student-t: t-Prob |
| ΔPrice | −0.0447849 | 0.020** | −0.0204596 | 0.150 |
| ΔRel_Positive | −4356.07 | <0.001*** | 360.002 | <0.001*** |
| ΔRel_Active | 9606.08 | <0.001*** | 1836.06 | <0.001*** |
| ΔRel_Affiliation | 4863.51 | <0.001*** | 149.458 | <0.001*** |
| ΔRel_Strong | −1255.61 | <0.001*** | −1380.03 | <0.001*** |
| Constant | −0.343865 | 0.678 | 0.824210 | 0.081* |
| α0 | 4.83313×10^−115 | ≪0.001*** | 1.90373×10^−14 | ≪0.001*** |
| α1 | 0.0512398 | <0.001*** | 0.111713 | ≪0.001 |
| β1 | 0.918548 | <0.001*** | 0.888287 | <0.001*** |
| Asymmetry (κ1) | 40.3643 | <0.001*** | 52.3062 | <0.001*** |
| Degree of freedom (Student-t) | N/A | N/A | 2.33052 | <0.001*** |

Notes: *** indicates significance at the 1% level and ** indicates significance at the 5% level.

The results in Table 6 indicate statistical significance of the regressors, including all the sentiment measures considered, in the predictive ADL models for FTSE 100 returns, both for normal and for heavy-tailed Student-$t$ innovations in the NGARCH models for the regression errors. Similar to the previous section, the results confirm the predictive power of the sentiment measures for the returns. They further confirm the presence of volatility clustering and other stylized facts in the ADL regression errors and the returns, properties that are not captured by ADL models estimated by OLS.

5 Causality between asset price volatility and sentiment volatility

The results in Sections 3 and 4 indicate the presence of volatility clustering in the errors from the predictive ADL models, with dynamics that can be modeled using NGARCH time series. In this section, we focus on the analysis of causality between the volatilities of the returns and sentiment processes. Similar to Patton [29], the analysis is based on Granger causality tests using the residuals from fitting GARCH-type models to both of the processes considered.

5.1 Models for causality between volatilities

Consider two time series $\{X_t\}$ and $\{Y_t\}$ and, as mentioned earlier, denote by $\mathcal{F}_t$ the filtration containing the information up to time $t$.
The analysis of causality between the volatilities of the processes is based on Granger causality tests for the innovations (residuals) $z_t$ and $z_t'$ from the GARCH processes fitted to $\{X_t\}$ and $\{Y_t\}$:

\[ X_t = \sigma_t z_t, \quad z_t \mid \mathcal{F}_t \sim N(0,1), \tag{5.1} \]

\[ Y_t = \delta_t z_t', \quad z_t' \mid \mathcal{F}_t \sim N(0,1), \tag{5.2} \]

and

\[ \sigma_t^2 = w + \alpha \cdot X_{t-1}^2 + \beta \cdot \sigma_{t-1}^2 + \varepsilon_t, \tag{5.3} \]

\[ \delta_t^2 = w' + \alpha' \cdot Y_{t-1}^2 + \beta' \cdot \delta_{t-1}^2 + \eta_t. \tag{5.4} \]

More precisely, the estimates of the GARCH parameters are obtained using maximum likelihood estimation (MLE), and then tests of Granger causality are conducted for the GARCH model residuals (standardized processes) $\hat{\varepsilon}_t = X_t / \hat{\sigma}_t$ and $\hat{\eta}_t = Y_t / \hat{\delta}_t$. The Granger causality testing described in Section 3 is used to investigate whether there is causality between $\{\varepsilon_t\}$ and $\{\eta_t\}$. In particular, if the tests indicate that $\{\varepsilon_t\}$ Granger causes $\{\eta_t\}$, then the information contained in the past volatility of $\{X_t\}$ is useful for forecasting the volatility of $\{Y_t\}$.
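The second step of this procedure, turning a series into standardized residuals once GARCH parameters are available, can be sketched as follows. The sketch assumes the parameters have already been estimated (the values below are made up), uses the standard deterministic GARCH(1,1) recursion without the additional error terms that appear in (5.3) and (5.4), and initializes the variance at w / (1 − α − β); the function names are hypothetical.

```python
import math


def garch_filter(x, w, alpha, beta):
    """Conditional variances sigma_t^2 = w + alpha * x_{t-1}^2 + beta * sigma_{t-1}^2."""
    sigma2 = [w / (1.0 - alpha - beta)]  # assumed initial value
    for t in range(1, len(x)):
        sigma2.append(w + alpha * x[t - 1] ** 2 + beta * sigma2[t - 1])
    return sigma2


def standardized_residuals(x, sigma2):
    """Residuals x_t / sigma_t, the inputs to the Granger causality tests."""
    return [xt / math.sqrt(s) for xt, s in zip(x, sigma2)]


# Made-up series and parameter values, for illustration only.
x = [1.0, -2.0, 0.5]
sigma2 = garch_filter(x, w=0.1, alpha=0.1, beta=0.8)
z = standardized_residuals(x, sigma2)
print(sigma2)
print(z)
```

Applying the same two functions to the returns and to a sentiment series yields the pair of residual series on which the tests of Section 3 are then run.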
In the following analysis, the approach is applied with the time series $\{X_t\}$ and $\{Y_t\}$ being the processes of asset returns and the measures of Twitter sentiment considered.

5.2 Data preparation

The analysis of Granger causality between volatilities is based on standardized returns and sentiment measures.

More precisely, given the observations on a time series $\{X_t\}$ (e.g., that of the returns or the sentiment measures), we consider its standardized version:

(5.5) $\mathrm{STD}(\{X_t\}) := \left\{ \frac{X_t - \bar{X}}{s_X}, \quad \forall t \right\},$

where, as usual, $\bar{X}$ and $s_X$ denote the sample mean and standard deviation of the time series observations.

The analysis is based on the standardized time series $\mathrm{STD}(\{r_t\})$ and $\mathrm{STD}(\{\mathrm{Sem}_t\})$ for the returns and sentiment processes $\{r_t\}$ and $\{\mathrm{Sem}_t\}$, respectively.

A few (approximately 5 out of 3,500) large outliers are observed in the standardized sentiment measure time series $\mathrm{STD}(\{\mathrm{Sem}_t\})$. The presence of the outliers may be due to a relatively large number of reposts of polarized sentiment-oriented Tweets. This is similar to the observations and the discussion in Tetlock [32] on some news having textual similarity with others. In the case of Twitter, outliers caused by reposts may not represent the real sentiment, as people can write their own comments along with their reposts. Some of these comments could have a sentiment that is totally opposite to that of the reposted Tweets. 
Further, the presence of such large outliers may severely affect the fitting of GARCH models, in part because the analyzed sentiment measures are squared in the analysis.

To deal with the outliers, we make the assumption that the maximum change in the standardized sentiment is the same as the maximum change in the standardized return and estimate the following model with GARCH errors (see also Carnero et al. [5]):

(5.6) $\mathrm{Sem}_t = k \cdot \mathrm{Sem}_t \cdot 1_{\{|\mathrm{Sem}_t| > \max(\{|r_t|\})\}} + u_t, \quad u_t = \sigma_t z_t, \quad z_t \mid \mathcal{F}_{t-1} \sim N(0,1),$

and

(5.7) $\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta_1 \sigma_{t-1}^2 + \varepsilon_t.$

The estimation results in Appendices 4 and 5 indicate that the coefficient $k$ is statistically significant and equal to 1 for all the sentiment measures considered. Hence, we further estimate the GARCH model for the following process:

$\mathrm{Sem}_t \cdot \left(1 - 1_{\{|\mathrm{Sem}_t| > \max(\{|r_t|\})\}}\right) = \sigma_t z_t, \quad z_t \mid \mathcal{F}_t \sim N(0,1),$

where the dynamics of the volatility $\sigma_t^2$ follows Eq. (5.7).

5.3 Granger causality tests for sentiment and return volatilities

The analysis and tests for Granger causality are similar to the discussion in Section 3. The Granger causality tests are provided for the volatility of the standardized returns and the adjusted sentiment measures for different categories as discussed in Section 5.2. 
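The data preparation of Section 5.2 can be sketched in a few lines of Python: standardize each series as in (5.5), then set to zero every standardized sentiment observation whose absolute value exceeds the largest absolute standardized return, which is what multiplying by $1 - 1_{\{|\mathrm{Sem}_t| > \max(\{|r_t|\})\}}$ does. The function names below are hypothetical, and the use of the sample standard deviation with the $n-1$ divisor is an assumption, as the paper does not state which divisor is used.

```python
from statistics import mean, stdev


def standardize(xs):
    """STD({X_t}): subtract the sample mean and divide by the sample std."""
    m, s = mean(xs), stdev(xs)  # stdev uses the n-1 divisor (assumption)
    return [(x - m) / s for x in xs]


def adjust_sentiment(std_sem, std_ret):
    """Zero out sentiment values larger in magnitude than max |r_t|."""
    cap = max(abs(r) for r in std_ret)
    return [0.0 if abs(s) > cap else s for s in std_sem]
```

For instance, with standardized returns `[1.0, -2.0]` the cap is 2, so a standardized sentiment value of -4 is replaced by 0 while values of 0.5 and 1 are left unchanged.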
The results of the tests are as follows.

The results in Table 7 indicate that the volatility of the FTSE 100 index return appears to be Granger caused by the volatility of all the sentiment measures related to the index, while this is not the case for the S&P 500 return volatility. In addition, according to the results in Table 8, the volatility of the Affiliation sentiment measure related to the FTSE 100 index appears to be Granger caused by the index return volatility. The results in the tables further point to the absence of Granger causality between the return and sentiment volatility for the S&P 500.

Table 7. F-statistics for the hypothesis $H_0$: the sentiment volatility does not Granger cause the asset return volatility
Granger causality: volatility of stock index caused by sentiment volatility?
 | Adj_Positive | Adj_Active | Adj_Strong | Adj_Affiliation
S&P500 | 0.20012 | 0.49954 | 0.5639 | 0.64188
FTSE100 | 1.503 × 10^-07*** | 6.994 × 10^-08*** | 8.891 × 10^-08*** | 1.263 × 10^-07***
Notes: *** indicates significance at the 1% level.

Table 8. F-statistics for the hypothesis $H_0$: the asset return volatility does not Granger cause the sentiment volatility
Granger causality: volatility of sentiment caused by stock index volatility?
 | Adj_Positive | Adj_Active | Adj_Strong | Adj_Affiliation
S&P500 | 0.43677 | 0.7713 | 0.73771 | 0.87481
FTSE100 | 0.65901 | 0.28048 | 0.12179 | 0.014483***
Notes: *** indicates significance at the 1% level.

6 Conclusion

This paper has focused on the problems of quantifying subjective sentiment and the analysis of its use as a predictor for asset returns and prices. The study is based on a vast amount of data (approximately 100 GB) acquired from Twitter using text mining and quantification of the sentiment according to the General Inquirer Categories of the Harvard Psychological Dictionary. 
The relation between the sentiment and the returns on the stock indices is analyzed using Granger causality tests and GARCH-type modeling.

The results of the analysis indicate that the Twitter sentiment apparently has predictive power for the returns on the S&P 500 and FTSE 100 financial indices. The results of the study further indicate that the volatility of the sentiment measures related to the FTSE 100 index appears to Granger cause the index return volatility, while this is not the case for the S&P 500 index.

An important problem that is left for further research is that of structural breaks in models for the relation and dependence between asset prices and returns and the sentiment, including the structural breaks due to the beginning of the ongoing COVID-19 pandemic.

The paper uses the Harvard Psychological Dictionary for sentiment analysis; further research may focus on applications of more recent sentiment analysis methods based on artificial neural networks and other machine learning techniques, such as Bidirectional Encoder Representations from Transformers (BERT) for natural language processing.

Because the sentiment appears to Granger cause the returns on the financial indices considered, further analysis may also focus on predictive models incorporating the sentiment and other predictors, including the factors used in predictive regressions for financial returns, and on using the sentiment as a signal for trading. The analysis may be based on widely used econometric methods as well as machine learning approaches.

As is common in the literature dealing with the analysis of dependence between financial and economic time series exhibiting volatility clustering, such as financial returns and foreign exchange rates (see, among others, Patton [29]), the analysis of Granger causality in the paper is based on estimated volatilities. 
Further research may focus on the development and the use of methods that account for the uncertainty introduced in the first stage of the analysis by volatility estimation. In particular, the robust $t$-statistic approaches to inference under heterogeneity, dependence, and heavy-tailedness developed by Ibragimov and Müller [21,22] (see also Section 3.3 in the study by Ibragimov et al. [23]) and their extensions appear promising for econometric inference using general two-stage procedures, as the approaches do not require consistent estimation of the limiting variances of the estimators dealt with or of their standard errors (see, in particular, Ibragimov and Müller [21] for a discussion of the applicability and the properties of $t$-statistic approaches to inference in two-stage instrumental variable (IV) regressions and general GMM models, and Abduraimova [1] for applications of the approaches in IV regressions for the analysis of the effects of contagion on tail risk in complex financial networks).

Dependence Modeling, de Gruyter

Equity returns and sentiment

Dependence Modeling, Volume 10 (1): 18 – Jan 1, 2022

Publisher
de Gruyter
Copyright
© 2022 Zibin Huang and Rustam Ibragimov, published by De Gruyter
ISSN
2300-2298
eISSN
2300-2298
DOI
10.1515/demo-2022-0109

Abstract

1 Introduction

Quantifying subjective opinion and using it as a predictor of stock market returns and prices has become an important topic of research and empirical analysis in academia and the industry. According to the efficient-market hypothesis (EMH) developed by Fama (see Fama et al. [11], Fama [12], and the references therein), information, and thus the news, is a driving source of equity prices, as an “asset price reflects all available information.”

This paper focuses on the question of whether public sentiment provides valuable information that affects stock prices and on quantifying the significance of the effects of public opinion and sentiment on equity prices. These questions are motivated by the research and the developments in behavioral economics and behavioral finance. In particular, research in these fields suggests that asset prices could be affected by human psychological or behavioral factors (Bollen et al. [4]). For instance, according to Gruhl et al. [17], Liu et al. [24], and Mishne and de Rijke [27], sales of books, movies, and other products can be predicted by sentiment in social media such as blogs, Twitter posts, and so on. Hence, it is reasonable to assume that public feeling could affect the returns and prices of financial assets and indices. This is further consistent with the research in psychology, which indicates that emotion plays a significant role in human decision-making (Damasio [9]). Studies in behavioral finance and related fields also indicate that emotion and sentiment contribute meaningfully to financial decisions and investors’ performance [25,28]. 
Moreover, research indicates that downward pressure on market prices is related to high media pessimism indicated by publications in the Wall Street Journal [33].

To study the effect of public mood on the stock price, we need to find a reliable, representative, and accessible proxy that can be used to construct time series of sentiment measures. Large-scale surveys for obtaining public mood are impractical: not only are they a waste of resources, but they also have great difficulty in producing time-sensitive data. On the other hand, Twitter, a popular social media website that was launched in 2006, has millions of posts per day. The average number of monthly active Twitter users is over 330 million. Users of Twitter come from a variety of backgrounds, including CEOs and analysts as well as the general public, which makes up the major part of the users. Therefore, it is reasonable to choose the sentiment of Twitter posts as a proxy for the public mood [4]. Many papers in the literature have focused on the analysis of the relation between Twitter sentiment and stock market returns (see, among others, Behrendt and Schmidt [3], Corea [8], Groß-Klußmann et al. [16], Mittal and Goel [26], Ranco et al. [30], Washha et al. [34]).

In the first stage, this paper investigates the relationship between prices and public mood for three major indices, S&P 500, NASDAQ, and FTSE 100, and one corporate stock, Apple Inc, using a large database of collected posts on Twitter with the indices and the stock’s tickers as keywords. 
In particular, the database includes approximately 8,029,963 posts for each keyword (3,000 randomly picked Tweets in each day) from January 1, 2008 to April 1, 2016.

The period considered thus does not include the beginning of the ongoing COVID-19 pandemic, and the analysis of structural breaks due to this and other shocks is beyond the scope of this paper and is left for future research.

We use Granger causality tests to investigate whether a change in public sentiment can cause a change in stock prices.

In the second stage, we explore models that may be able to further explain the relationship between Twitter sentiment and the returns and prices on the S&P 500 and FTSE 100 indices. We employ and estimate Nonlinear Generalized AutoRegressive Conditional Heteroskedasticity (NGARCH) time series to explain and quantify the relationship and to model the effects of Twitter sentiment on volatility clustering in financial markets.

In the third stage, we focus on the analysis of the effects of Twitter sentiment on market volatility using the fitted GARCH models and Granger causality tests.

2 Data acquisition

This section discusses where and how the stock price and Twitter data are acquired and describes the methods for quantifying the sentiment in the text from Twitter posts and generating the data on sentiment measures that can be used in the analysis, tests, and model development.

The data on stock prices for the assets considered are downloaded from Yahoo Finance.

For the sentiment data, an R program has been developed that uses the Twitter Application Programming Interface (API) to download an assigned and fixed number of Twitter posts (or tweets) with a particular keyword for each day throughout the time interval under consideration. In particular, we used the stable CRAN version of the twitteR library [14], which was developed on top of the standard Twitter API (Twitter application access API at http://dev.twitter.com/). 
We used the “searchTwitter” function in this package to obtain data for a given date, keywords, maximum number of returned tweets (the maximum number of returned tweets is limited by the API capability at the time), and other search strings. This program analyzed a total of 100 GB of data from Twitter. Following Tetlock [33], to quantify sentiment, this paper uses various types of sentiment measures as suggested in the General Inquirer Categories in the Harvard Psychological Dictionary [13].

The general analysis procedure employed in the paper is depicted in the flow graph (Figure 1).

Figure 1. The analysis procedure.

2.1 Twitter mining algorithm

This section describes the algorithm that was used to convert Twitter text data into time series of different sentiment measures.

First, for each day in the period from January 1, 2008 to January 4, 2016, a random sample of 3,000 Tweets was extracted. All Tweets on the same day were collected to form a large text file that was used as a proxy for public comments on Twitter. For each of the downloaded daily text files, all the punctuation and other symbols (e.g., “https://”) were removed to form a crude corpus. The crude corpus was then filtered further to remove words that are meaningless for the purpose of sentiment quantification, such as “is,” “this,” etc., to form the final daily corpus.

Second, each daily corpus was checked against the Harvard Psychological Dictionary’s General Inquirer Categories [13] with four broad classes, Positive, Negative, Active, and Passive, and their subclasses Affiliation, Hostile, Strong, and Weak, by counting the frequency of words in the corpus that fall into a particular category. Hence, for each day, eight values for the word frequency in each of the groups were obtained from the collected Tweets. The process was repeated for every day in the time interval considered. 
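The two steps above (cleaning the daily text and counting words per category) can be sketched as follows. This is only an illustration in Python, not the R program used in the paper: the word lists in `CATEGORIES` and `STOPWORDS` are tiny hypothetical stand-ins for the General Inquirer categories and the stop-word filter, only four of the eight categories are shown, and the function name `daily_counts` is chosen here.

```python
import re
from collections import Counter

# Hypothetical toy word lists; the paper uses the General Inquirer categories.
CATEGORIES = {
    "Positive": {"good", "gain", "up"},
    "Negative": {"drop", "unable", "loss"},
    "Active": {"buy", "run"},
    "Passive": {"wait", "hold"},
}
# Hypothetical stop-word list standing in for the paper's filtration step.
STOPWORDS = {"is", "this", "the", "a", "to"}


def daily_counts(tweets):
    """Count, per category, the words of one day's tweets that fall in it."""
    text = " ".join(tweets).lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    words = [w for w in re.findall(r"[a-z]+", text) if w not in STOPWORDS]
    freq = Counter(words)
    return {cat: sum(freq[w] for w in vocab) for cat, vocab in CATEGORIES.items()}
```

Calling `daily_counts` once per day and collecting the resulting dictionaries in date order yields one raw count series per category.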
Following the procedure, time series of the raw sentiment data from Twitter were generated.

The example in Appendix A1 demonstrates how the algorithm works for five randomly acquired Tweets on a particular date. The Tweets considered in Appendix A1 are not sentimentally neutral and contain polarized words such as “drop,” “Active,” “unable,” etc. that indicate their sentiment orientation to some extent. This observation also provides a logical justification for using the categorization method in the paper.

Third, to get time series for testing the Granger causality between the Twitter sentiment and the stock price, several different sentiment measures are used in the paper. The sentiment measures considered are inspired by the analysis by Zhang and Skiena [35] and are summarized in the following formulas, with #Positive, #Negative, etc. standing for the number of words in the positive, negative, and other corresponding sentiment categories. The polarity sentiment measure is defined as follows:

(2.1) $\text{Polarity} = \frac{\#\text{Positive} - \#\text{Negative}}{\#\text{Positive} + \#\text{Negative}}.$

Unlike asset prices, the Polarity measure is not guaranteed to be positive, as the number of positive words in the Twitter posts considered is not necessarily larger than the number of negative words. 
To ensure positivity of the sentiment measures considered, without loss of generality, Zhang and Skiena’s Polarity measures are modified as the “Relative positive” measures, which are defined as follows:

(2.2) $\text{Relative\_positive} = \frac{\#\text{Positive}}{\#\text{Positive} + \#\text{Negative}}.$

In a similar way, we define the Relative Affiliation, Relative Active, and Relative Strong measures as follows:

(2.3) $\text{Relative\_Active} = \frac{\#\text{Active}}{\#\text{Active} + \#\text{Passive}}.$

The categories Affiliation and Strong are subclasses of the Positive and Active categories. Their relative sentiment measures are defined as follows:

(2.4) $\text{Relative\_Affiliation} = \frac{\#\text{Affiliation}}{\#\text{Positive} + \#\text{Negative}},$

(2.5) $\text{Relative\_Strong} = \frac{\#\text{Strong}}{\#\text{Active} + \#\text{Passive}}.$

3 Granger causality analysis

Once the sentiment data had been acquired, Granger causality tests were performed to investigate whether there is causality between the Twitter sentiment and the stock price, which also gives a partial answer to the question on the relationship between these variables. The Granger causality tests are applied to time series confirmed to be stationary $I(0)$ processes.

3.1 (Non-)stationarity analysis

First, we conduct augmented Dickey-Fuller (ADF) tests to investigate the presence of a unit root in the processes considered. 
Testing for a unit root in a time series $X_t$ is based on the ordinary least squares (OLS) regression

$\Delta X_t = \beta_0 + \alpha t + \delta X_{t-1} + \gamma_1 \Delta X_{t-1} + \ldots + \gamma_p \Delta X_{t-p} + \varepsilon_t,$

where $\beta_0$ is a constant, $\alpha$ is a time trend coefficient, and $\varepsilon_t$ is the innovation process with zero mean. Under $H_0: \delta = 0$, the process is nonstationary, and $H_a: \delta < 0$ corresponds to stationarity in $X_t$ (Dickey and Fuller [10], Ch. 17 in Hamilton [19], and Section 15.7 in Stock and Watson [31]).

The results of the ADF tests indicate that all the processes considered, including all the time series of sentiment measures dealt with and the logarithms of stock prices, are unit root $I(1)$ processes. The analysis and tests of Granger causality in the paper are therefore based on stationary $I(0)$ first differences of the aforementioned processes. In other words, in the following section, we focus on testing Granger causality between the changes in the log price and the changes in the Twitter sentiment measures considered.

3.2 Autoregressive distributed lag models in Granger causality tests

A process $X_t$ is said to Granger cause a process $Y_t$ if the lags of $X_t$ have useful predictive content for forecasting $Y_t$, above and beyond the other regressors, e.g., the lags of $Y_t$ itself, in the model (see, among others, Section 15.4 in [31]). The Granger causality test is usually carried out using autoregressive distributed lag (ADL) models

(3.1) $Y_t = \beta_0 + \sum_{j=1}^{J} \beta_j Y_{t-j} + \sum_{k=1}^{K} \gamma_k X_{t-k} + u_t.$

The Granger causality test is carried out using an $F$-test on all the coefficients on the lags of $X_t$. 
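The $F$-test on the lag coefficients in (3.1) can be sketched by comparing the residual sums of squares of the restricted regression (lags of $Y_t$ only) and the unrestricted regression (lags of $Y_t$ and $X_t$). The Python code below is an illustration under assumptions, not the paper's implementation: it computes the homoskedasticity-only $F$-statistic, takes the lag orders `J` and `K` as given, and uses the hypothetical function name `granger_f`; converting the statistic to a p-value would additionally require the $F$ distribution.

```python
import numpy as np


def granger_f(y, x, J, K):
    """F-statistic for H0: the K lags of x have zero coefficients in (3.1)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    p = max(J, K)                      # observations lost to lagging
    T = len(y) - p
    target = y[p:]
    y_lags = [y[p - j:len(y) - j] for j in range(1, J + 1)]
    x_lags = [x[p - k:len(x) - k] for k in range(1, K + 1)]
    X_r = np.column_stack([np.ones(T)] + y_lags)   # restricted model
    X_u = np.column_stack([X_r] + x_lags)          # unrestricted model

    def ssr(X):
        beta, *_ = np.linalg.lstsq(X, target, rcond=None)
        resid = target - X @ beta
        return float(resid @ resid)

    ssr_r, ssr_u = ssr(X_r), ssr(X_u)
    dof = T - X_u.shape[1]
    return ((ssr_r - ssr_u) / K) / (ssr_u / dof)
```

With simulated data in which $Y_t$ depends on $X_{t-1}$ but not the other way round, `granger_f(y, x, 1, 1)` is large while `granger_f(x, y, 1, 1)` is typically small, mirroring the one-directional causality pattern described in the text.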
The null hypothesis in this test is $H_0: \forall k \in \{1, \ldots, K\},\ \gamma_k = 0$, which equivalently means that $X_t$ is not a useful predictor of $Y_t$, given the lags of the latter process. The alternative hypothesis in the test is $H_a: \exists k \in \{1, \ldots, K\},\ \gamma_k \ne 0$, which corresponds to the property that the lags of $X_t$ do have some useful predictive content for forecasting $Y_t$, beyond the lags of $Y_t$ itself.

Let $P_t$ denote the logarithm of the price of a stock/index considered and let $\mathrm{Sem}_t$ denote a sentiment measure. The test of whether the Twitter sentiment Granger causes the stock price is based on the following model:

(3.2) $\Delta P_t = \beta_0 + \sum_{j=1}^{J} \beta_j \Delta P_{t-j} + \sum_{k=1}^{K} \gamma_k \Delta \mathrm{Sem}_{t-k} + u_t.$

Similarly, the test of whether the stock price Granger causes the Twitter sentiment is based on the model

(3.3) $\Delta \mathrm{Sem}_t = \beta_0 + \sum_{m=1}^{M} \beta_m \Delta \mathrm{Sem}_{t-m} + \sum_{n=1}^{N} \gamma_n \Delta P_{t-n} + \varepsilon_t.$

3.3 Determination of the number of lags using the BIC criterion

As shown in Eqs. (3.1)–(3.3), implementation of Granger causality tests requires determination of the numbers of lags $J, K, M, N$ for both of the processes considered. We use the Bayesian information criterion (BIC) for determination of the number of lags of the processes $X_t$ and $Y_t$ ($\Delta P_t$ and $\Delta \mathrm{Sem}_t$) in the ADL models dealt with. 
More precisely, as usual, first, the number $J$ of lags in the autoregressive (AR) model for the process $Y_t$,

$Y_t = \beta_0 + \sum_{j=1}^{J} \beta_j Y_{t-j} + u_t,$

is determined based on the BIC, and then the criterion is applied to determine the number of lags $K$ of the potential predictor $X_t$ in model (3.1) with the estimated number $J$.

The results of the lag length selection on the basis of the BIC for Granger causality tests are provided in Appendix A2.

3.4 The results of Granger causality tests

The results of tests of Granger causality between the considered asset returns and the corresponding Twitter sentiment are provided in Tables 1 and 2.

Table 1. F-statistics for the hypothesis $H_0$: the changes in sentiment do not Granger cause the asset returns
 | Analyzed tweets | Relative positive | Relative active | Relative affiliation | Relative strong
S&P500 | 9,116,537 | 0.0024253*** | 0.010023** | 0.01043** | 0.0106**
FTSE100 | 9,099,000 | 0.012089** | 0.022854** | 0.017246** | 0.011853**
NASDAQ | 8,826,775 | 0.85884 | 0.44155 | 0.83846 | 0.64976
AAPL | 7,608,000 | 0.40384 | 0.86842 | 0.26875 | 0.13011
Notes: *** indicates significance at the 1% level and ** indicates significance at the 5% level.

Table 2. F-statistics for the hypothesis $H_0$: the asset returns do not Granger cause the changes in sentiment
 | Analyzed tweets | Relative positive | Relative active | Relative affiliation | Relative strong
S&P500 | 9,116,537 | 0.88878 | 0.47615 | 0.23448 | 0.94569
FTSE100 | 9,099,000 | 0.92718 | 0.33062 | 0.57001 | 0.80284
NASDAQ | 8,826,775 | 0.75628 | 0.66484 | 0.54052 | 0.91209
AAPL | 7,608,000 | 0.86636 | 0.62385 | 0.67329 | 0.69855

Similar to Section 3.2, the null hypotheses in the tables are that the changes in sentiment do not Granger cause the changes in the log prices, that is, the returns, and vice versa.

The results in Table 2 indicate that, somewhat 
surprisingly, the changes in (log) prices apparently do not Granger cause the Twitter sentiment. These conclusions are in contrast to the conventional belief that the changes in asset prices affect public sentiment.

Further, the results in Tables 1 and 2 point to the conclusion that the change in Twitter sentiment related to the S&P 500 and FTSE 100 indices Granger causes their price changes but not vice versa. In particular, according to Table 1, among all the sentiment measures and the assets considered, the effect of the Relative Positive measure on S&P 500 returns appears to be the most significant, with significance of the test statistic at the 1% level. In contrast, the returns on the NASDAQ index and the Apple stock appear not to be Granger caused by the respective Twitter sentiment.

3.5 S&P 500: Granger causality tests using the big data

To further evaluate and confirm the results on Granger causality in the previous section, we conduct the tests of Granger causality between the S&P 500 returns and the respective sentiment measures using a very large-scale database.

In contrast to the first-stage analysis (3,000 randomly selected tweets per day with the target keywords, obtained by limiting the maximum number of returned tweets in the twitteR searchTwitter function), in this section we do not limit the maximum number of returned tweets in searchTwitter and instead request the maximum number that the Twitter application access API can provide in one request. The indices and the stock tickers are used as keywords in the search. The twitteR library is based on the Twitter application access API, which limits not only the maximum number of tweets in each request but also the number of requests from a given IP address in a given time period. To acquire a large dataset for the analysis, we registered multiple accounts (and hence multiple tokens) and switched the IP address each time a request was sent. 
The analysis is based on all the acquirable data from the Twitter API for the index in the time period considered. The analysis is not conducted for the FTSE 100 index as there are far fewer posts related to it than to the S&P 500. As indicated earlier, for the S&P 500, as much data as possible was extracted and analyzed for each day in the time period considered, with approximately 14,005 Tweets per day (this number is an average computed over all the data obtained with the target keywords) and 42,478,072 Tweets in total.

The general observation was that the number of tweets obtained per day for a given keyword increases with time. This is consistent with the fact that Twitter has become more widely known and more people post their thoughts on it over time. For example, some keywords returned only 5,000 tweets per day in 2008, but the number of tweets obtained per day increased in the following year. Eventually, the daily number of tweets obtained for a specific keyword is limited by the Twitter API we used.

The results of Granger causality tests for the S&P 500 returns and the sentiment measures using the large-scale database are provided in the second rows of Tables 3 and 4. 
For comparison, we also provide, in the first rows of the tables, the Granger causality test statistics for the S&P 500 from the previous section.

Table 3. F-statistics for the hypothesis $H_0$: the changes in sentiment do not Granger cause the S&P 500 returns
 | Analyzed tweets | Relative positive | Relative active | Relative affiliation | Relative strong
1st attempt | 9,116,537 | 0.0024253*** | 0.010023** | 0.01043** | 0.0106**
2nd attempt | 42,478,072 | 0.0054962*** | 0.0042141*** | 0.0036191*** | 0.0086103***
Notes: *** indicates significance at the 1% level and ** indicates significance at the 5% level.

Table 4. F-statistics for the hypothesis $H_0$: the S&P 500 returns do not Granger cause the change in sentiment
 | Analyzed tweets | Relative positive | Relative active | Relative affiliation | Relative strong
1st attempt | 9,116,537 | 0.88878 | 0.47615 | 0.23448 | 0.94569
2nd attempt | 42,478,072 | 0.78777 | 0.10034 | 0.67303 | 0.72055

The results in Tables 3 and 4 using the large-scale data confirm the results in the previous section that the Twitter sentiment related to the S&P 500 index appears to Granger cause its returns and (log) price changes but not vice versa.

In conclusion, the returns and the prices of the S&P 500 and FTSE 100 indices appear to be Granger caused by the public sentiment. On the other hand, according to the results in this and the previous section, the changes in prices of the assets considered appear not to Granger cause the respective sentiment.

4 Causality modeling: ADL and GARCH models

As discussed in the previous section, the Twitter sentiment appears to Granger cause the returns on the S&P 500 and FTSE 100 indices. In this section, we focus on the analysis of models for the relationship between the returns on the indices and the respective sentiment. 
In particular, we evaluate the ADL models for the relationship and further fit GARCH time series to model the effects of sentiment volatility on market volatility.

4.1 Volatility clustering

We first focus on the estimation of ADL models for the relationship between the returns on the S&P 500 and FTSE 100 indices and the lags of the sentiment measures. Similar to the analysis in the previous section, following the results in Section 3.1, the models are estimated for the stationary changes in the log prices (the returns) and the stationary changes in the measures of sentiment dealt with.

The estimated ADL models include all the sentiment measures that were shown in the previous section to Granger cause the returns on the indices. The estimated models thus have the following form:

(4.1) $\Delta P_t = \beta_0 + \sum_{k=1}^{K} \beta_k \Delta P_{t-k} + \sum_{i=1}^{I} \zeta_i \Delta \mathrm{Pos}_{t-i} + \sum_{j=1}^{J} \gamma_j \Delta \mathrm{Act}_{t-j} + \sum_{l=1}^{L} \eta_l \Delta \mathrm{Aff}_{t-l} + \sum_{m=1}^{M} \alpha_m \Delta \mathrm{Str}_{t-m} + \varepsilon_t,$

where $\{\Delta P_t\}$, $\{\Delta \mathrm{Pos}_t\}$, $\{\Delta \mathrm{Act}_t\}$, $\{\Delta \mathrm{Aff}_t\}$, and $\{\Delta \mathrm{Str}_t\}$ are the time series of the changes in the logarithm of the prices of the indices and of the changes in the Relative Positive, Relative Active, Relative Affiliation, and Relative Strong sentiment measures, respectively. The form of the ADL models is motivated by accounting for different classes and subclasses of sentiment in the Harvard Psychological Dictionary’s General Inquirer Categories [13] and the related measures in Section 2.1 used in the sentiment analysis in the paper. 
The numbers $I, J, L$, and $M$ of lags included in the models are those determined by the BIC criterion as discussed in Section 3.3 and Appendix A2.

ADL models (4.1) estimated by OLS result in a poor fit for the time series of the returns on both the S&P 500 and FTSE 100 indices. Further, the plots of the residuals from the ADL regressions point to pronounced volatility clustering in the errors in the estimated linear models.

The results on the poor fit of linear models for the returns and the presence of volatility clustering in the ADL regression errors and the returns are in accordance with the well-known stylized facts of the absence of linear dependence and the presence of nonlinear dependence in financial returns (see Cont [7]).

Following these results, in the next section, we focus on models capturing the volatility clustering in the ADL model errors and the returns on the indices considered.

4.2 Modeling Granger causality and volatility clustering: ADL models with NGARCH errors

To model Granger causality between the sentiment and the returns on the S&P 500 and FTSE 100 indices accounting for volatility clustering in the returns, as usual, we employ GARCH-type time series. As is well known, GARCH-type processes can be used to capture and model most of the stylized facts of financial returns, including the absence of linear autocorrelations, the presence of volatility clustering and autocorrelations in squared returns, heavy-tailedness and conditional heavy-tailedness, and the leverage effect (see, among others, Alberg et al. [2], Christoffersen [6], and Cont [7]). 
Throughout, $\mathcal{F}_t$ denotes the filtration that contains all the information up to time $t$, and $N(0,1)$ and $t(\nu)$ denote the standard normal and the (heavy-tailed) Student-$t$ distribution with $\nu$ degrees of freedom, respectively.

The following ADL models with NGARCH errors exhibiting the aforementioned stylized facts are estimated using maximum likelihood (ML):

\[ \Delta P_t = \omega_0 + \sum_{k=1}^{K} \omega_k \Delta P_{t-k} + \sum_{i=1}^{I} \zeta_i \Delta \mathrm{Pos}_{t-i} + \sum_{j=1}^{J} \gamma_j \Delta \mathrm{Act}_{t-j} + \sum_{l=1}^{L} \eta_l \Delta \mathrm{Aff}_{t-l} + \sum_{m=1}^{M} \alpha_m \Delta \mathrm{Str}_{t-m} + u_t, \tag{4.2} \]

where

\[ u_t = \varepsilon_t \sigma_t, \quad \varepsilon_t \mid \mathcal{F}_{t-1} \sim N(0,1) \ \text{or} \ t(\nu), \tag{4.3} \]

with i.i.d. innovations $\varepsilon_t$ that have a standard normal or Student-$t$ distribution, and the volatility dynamics are given by the NGARCH model in the following form:

\[ \sigma_t^2 = \alpha_0 + \alpha_1 (u_{t-1} - \kappa_1)^2 + \beta_1 \sigma_{t-1}^2. \tag{4.4} \]

The NGARCH specification for the errors in the ADL models for index returns accounts for the absence of linear autocorrelations, volatility clustering, heavy-tailedness, and the leverage effect in the return time series.

The models impose stationarity, that is, the condition $\alpha_1 + \beta_1 < 1$ on the GARCH parameters.

The results of the ML estimation of the aforementioned models are provided in the following sections.

4.2.1 S&P 500

As shown in the results in Table 5, unlike linear ADL models estimated by OLS, the ADL models with NGARCH errors described in the previous section provide an exceptional fit for the S&P 500 returns.

Table 5. AGARCH model for S&P 500

| | Coefficient (Normal $N(0,1)$) | $t$-Prob (Normal $N(0,1)$) | Coefficient (Student-$t$) | $t$-Prob (Student-$t$) |
|---|---|---|---|---|
| $\Delta$Price | −0.0512350 | $0.014^{*}$ | −0.0228679 | $0.082^{*}$ |
| $\Delta$Rel_Positive | 685.612 | $0.088^{*}$ | 539.484 | $<0.001^{***}$ |
| $\Delta$Rel_Active | −328.132 | 0.594 | −108.797 | $<0.001^{***}$ |
| $\Delta$Rel_Affiliation | 73.6310 | 0.854 | 192.964 | $<0.001^{***}$ |
| $\Delta$Rel_Strong | −234.034 | 0.361 | −188.109 | $<0.001^{***}$ |
| Constant | 0.721184 | $<0.001^{***}$ | 0.567255 | $<0.001^{***}$ |
| $\alpha_0$ | 2.84304 | $0.002^{***}$ | 7.78590 | $0.006^{***}$ |
| $\alpha_1$ | 0.0674871 | $<0.001^{***}$ | 0.0789915 | $<0.001^{***}$ |
| $\beta_1$ | 0.917239 | $<0.001^{**}$ | 0.921008 | $<0.001^{***}$ |
| Asymmetry ($\kappa_1$) | | | | |
| Degree of freedom (Student-$t$) | N/A | N/A | 2.37587 | $\ll 0.001^{***}$ |

Notes: $^{***}$ indicates the 1% significance, $^{**}$ indicates the 5% significance, and $^{*}$ indicates the 10% significance.

The results in Table 5 further confirm that the changes in the sentiment measures are useful predictors of the changes in the index prices and returns. In particular, in the case of the ADL models with NGARCH errors and heavy-tailed Student-$t$ innovations, the lags of the changes in all the sentiment measures appear to be highly significant, with the corresponding $p$-values less than 0.001. Further, even in the case of ADL-NGARCH model errors with standard normal innovations, one of the sentiment measures, Relative Positive, exhibits statistical significance (at the 10% level) in predictive models for the S&P 500 returns.

4.2.2 FTSE 100

Similar to the S&P 500 case, the estimation results in Table 6 for predictive ADL regression models for FTSE 100 returns with NGARCH errors demonstrate statistical significance of the changes in the sentiment measures.

Table 6. AGARCH model for FTSE 100

| | Coefficient (Normal $N(0,1)$) | $t$-Prob (Normal $N(0,1)$) | Coefficient (Student-$t$) | $t$-Prob (Student-$t$) |
|---|---|---|---|---|
| $\Delta$Price | −0.0447849 | $0.020^{**}$ | −0.0204596 | 0.150 |
| $\Delta$Rel_Positive | −4356.07 | $<0.001^{***}$ | 360.002 | $<0.001^{***}$ |
| $\Delta$Rel_Active | 9606.08 | $<0.001^{***}$ | 1836.06 | $<0.001^{***}$ |
| $\Delta$Rel_Affiliation | 4863.51 | $<0.001^{***}$ | 149.458 | $<0.001^{***}$ |
| $\Delta$Rel_Strong | −1255.61 | $<0.001^{***}$ | −1380.03 | $<0.001^{***}$ |
| Constant | −0.343865 | 0.678 | 0.824210 | $0.081^{*}$ |
| $\alpha_0$ | $4.83313 \times 10^{-115}$ | $\ll 0.001^{***}$ | $1.90373 \times 10^{-14}$ | $\ll 0.001^{***}$ |
| $\alpha_1$ | 0.0512398 | $<0.001^{***}$ | 0.111713 | $\ll 0.001$ |
| $\beta_1$ | 0.918548 | $<0.001^{***}$ | 0.888287 | $<0.001^{***}$ |
| Asymmetry ($\kappa_1$) | 40.3643 | $<0.001^{***}$ | 52.3062 | $<0.001^{***}$ |
| Degree of freedom (Student-$t$) | N/A | N/A | 2.33052 | $<0.001^{***}$ |

Notes: $^{***}$ indicates the 1% significance and $^{**}$ indicates the 5% significance.

The results in Table 6 indicate statistical significance of the regressors, including all the sentiment measures considered, in the predictive ADL models for FTSE 100 returns, both for normal and for heavy-tailed Student-$t$ innovations in the NGARCH models for the regression errors. Similar to the previous section, the results confirm the predictive power of the sentiment measures for the returns and further confirm the presence of volatility clustering and other stylized facts in the ADL regression errors and the returns, properties that are not captured by ADL models estimated by OLS.

5 Causality between asset price volatility and sentiment volatility

The results in Sections 3 and 4 indicate the presence of volatility clustering in the errors from the predictive ADL models, with dynamics that can be modeled using NGARCH time series. In this section, we focus on the analysis of causality between the volatilities of the returns and sentiment processes. Similar to Patton [29], the analysis is based on Granger causality tests using the residuals from fitting GARCH-type models to both of the processes considered.

5.1 Models for causality between volatilities

Consider two time series $\{X_t\}$ and $\{Y_t\}$ and, as mentioned earlier, denote by $\mathcal{F}_t$ the filtration containing the information up to time $t$.
The analysis of causality between the volatilities of the processes is based on Granger causality tests for the innovations (residuals) $z_t$ and $z_t'$ from the GARCH processes fitted to $\{X_t\}$ and $\{Y_t\}$:

\[ X_t = \sigma_t z_t, \quad z_t \mid \mathcal{F}_t \sim N(0,1), \tag{5.1} \]

\[ Y_t = \delta_t z_t', \quad z_t' \mid \mathcal{F}_t \sim N(0,1) \tag{5.2} \]

and

\[ \sigma_t^2 = w + \alpha \cdot X_{t-1}^2 + \beta \cdot \sigma_{t-1}^2 + \varepsilon_t, \tag{5.3} \]

\[ \delta_t^2 = w' + \alpha' \cdot Y_{t-1}^2 + \beta' \cdot \delta_{t-1}^2 + \eta_t. \tag{5.4} \]

More precisely, the estimates of the GARCH parameters are obtained using maximum likelihood estimation (MLE), and then tests of Granger causality are conducted for the GARCH model residuals (standardized processes) $\hat{\varepsilon}_t = X_t / \hat{\sigma}_t$ and $\hat{\eta}_t = Y_t / \hat{\delta}_t$. The Granger causality testing described in Section 3 is used to investigate whether there is causality between $\{\varepsilon_t\}$ and $\{\eta_t\}$. In particular, if the tests indicate that $\{\varepsilon_t\}$ Granger causes $\{\eta_t\}$, then this implies that the information contained in the past volatility of $\{X_t\}$ is useful for forecasting the volatility of $\{Y_t\}$.
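The filtering step of this two-stage procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the GARCH parameters are assumed to have been estimated already (for example, by ML), the recursion is the textbook GARCH(1,1) form (that is, (5.3) without its additive term), and the recursion is started at the sample second moment.

```python
import math


def garch11_filter(x, omega, alpha, beta):
    """Conditional variances from the GARCH(1,1) recursion
    sigma_t^2 = omega + alpha * x_{t-1}^2 + beta * sigma_{t-1}^2.

    The parameters are taken as given; estimating them is outside this sketch.
    """
    # Assumption: initialise with the sample second moment of the series.
    sigma2 = [sum(v * v for v in x) / len(x)]
    for t in range(1, len(x)):
        sigma2.append(omega + alpha * x[t - 1] ** 2 + beta * sigma2[t - 1])
    return sigma2


def standardized_residuals(x, sigma2):
    """Standardized process x_t / sigma_t used as input to the causality tests."""
    return [v / math.sqrt(s2) for v, s2 in zip(x, sigma2)]
```

Applying the two functions to each of the two series yields the standardized processes, which would then be passed to a Granger causality test of the kind described in Section 3.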
In the following analysis, the approach is applied with the time series $\{X_t\}$ and $\{Y_t\}$ being the processes of asset returns and the measures of Twitter sentiment considered.

5.2 Data preparation

The analysis of Granger causality between volatilities is based on standardized returns and sentiment measures.

More precisely, given the observations on a time series $\{X_t\}$ (e.g., that of the returns or the sentiment measures), we consider its standardized version:

\[ \mathrm{STD}(\{X_t\}) := \left\{ \frac{X_t - \bar{X}}{s_X}, \quad \forall t \right\}, \tag{5.5} \]

where, as usual, $\bar{X}$ and $s_X$ denote the sample mean and standard deviation of the time series observations.

The analysis is based on the standardized time series $\mathrm{STD}(\{r_t\})$ and $\mathrm{STD}(\{\mathrm{Sem}_t\})$ for the returns and sentiment processes $\{r_t\}$ and $\{\mathrm{Sem}_t\}$, respectively.

A few (approximately 5 out of 3,500) large outliers are observed in the standardized sentiment measure time series $\mathrm{STD}(\{\mathrm{Sem}_t\})$. The presence of the outliers may be due to a relatively large number of reposts of polarized sentiment-oriented Tweets. This is similar to the observations and the discussion in Tetlock [32] on some news having textual similarity with others. In the case of Twitter, outliers caused by reposts may not represent the real sentiment, as people can write their own comments along with their reposts. Some of these comments could have a sentiment that is totally opposite to that of the reposted Tweets.
Further, the presence of such large outliers may severely affect the fitting of GARCH models, in part because the analyzed sentiment measures are squared in the analysis.

To deal with the outliers, we assume that the maximum change in the standardized sentiment is the same as the maximum change in the standardized return and estimate the following model with GARCH errors (see also Carnero et al. [5]):

\[ \mathrm{Sem}_t = k \cdot \mathrm{Sem}_t \cdot 1_{\{|\mathrm{Sem}_t| > \max(\{|r_t|\})\}} + u_t, \tag{5.6} \]

\[ u_t = \sigma_t z_t, \quad z_t \mid \mathcal{F}_{t-1} \sim N(0,1) \]

and

\[ \sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta_1 \sigma_{t-1}^2 + \varepsilon_t. \tag{5.7} \]

The estimation results in Appendices 4 and 5 indicate that the coefficient $k$ is statistically significant and equal to 1 for all the sentiment measures considered. Hence, we further estimate the GARCH model for the following process:

\[ \mathrm{Sem}_t \cdot \left(1 - 1_{\{|\mathrm{Sem}_t| > \max(\{|r_t|\})\}}\right) = \sigma_t z_t, \quad z_t \mid \mathcal{F}_t \sim N(0,1), \]

and the dynamics of the volatility $\sigma_t^2$ follow Eq. (5.7).

5.3 Granger causality tests for sentiment and return volatilities

The analysis and the tests for Granger causality are similar to those discussed in Section 3. The Granger causality tests are provided for the volatility of the standardized returns and the adjusted sentiment measures for the different categories, as discussed in Section 5.2.
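The data preparation of Section 5.2 that produces these inputs can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that "sample standard deviation" means the $n-1$ denominator and that, with $k = 1$, the adjustment amounts to setting to zero every standardized sentiment value whose magnitude exceeds the largest absolute standardized return.

```python
import statistics


def standardize(x):
    """Eq. (5.5): subtract the sample mean and divide by the sample
    standard deviation (n - 1 denominator, an assumption of this sketch)."""
    mean = statistics.mean(x)
    sd = statistics.stdev(x)
    return [(v - mean) / sd for v in x]


def adjust_sentiment(sem_std, ret_std):
    """Sem_t * (1 - indicator): zero out standardized sentiment values whose
    absolute value exceeds the maximum absolute standardized return."""
    threshold = max(abs(r) for r in ret_std)
    return [0.0 if abs(s) > threshold else s for s in sem_std]
```

For example, `adjust_sentiment(standardize(sem), standardize(ret))` would return the adjusted sentiment series for raw input lists `sem` and `ret`, to which a GARCH model is then fitted.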
The results of the tests are as follows.

The results in Table 7 indicate that the volatility of the FTSE 100 index return appears to be Granger caused by the volatility of all the sentiment measures related to the index, while this is not the case for the S&P 500 return volatility. In addition, according to the results in Table 8, the volatility of the Affiliation sentiment measure related to the FTSE 100 index appears to be Granger caused by the index return volatility. The results in the tables further point to the absence of Granger causality between the return and sentiment volatilities for the S&P 500.

Table 7. $F$-statistics for the hypothesis $H_0$: the sentiment volatility does not Granger cause the asset return volatility

Granger causality: Volatility of stock index caused by sentiment volatility?

| | Adj_Positive | Adj_Active | Adj_Strong | Adj_Affiliation |
|---|---|---|---|---|
| S&P 500 | 0.20012 | 0.49954 | 0.5639 | 0.64188 |
| FTSE 100 | $1.503 \times 10^{-7}\,^{***}$ | $6.994 \times 10^{-8}\,^{***}$ | $8.891 \times 10^{-8}\,^{***}$ | $1.263 \times 10^{-7}\,^{***}$ |

Notes: $^{***}$ indicates the 1% significance.

Table 8. $F$-statistics for the hypothesis $H_0$: the asset return volatility does not Granger cause the sentiment volatility

Granger causality: Volatility of sentiment caused by stock index volatility?

| | Adj_Positive | Adj_Active | Adj_Strong | Adj_Affiliation |
|---|---|---|---|---|
| S&P 500 | 0.43677 | 0.7713 | 0.73771 | 0.87481 |
| FTSE 100 | 0.65901 | 0.28048 | 0.12179 | $0.014483^{***}$ |

Notes: $^{***}$ indicates the 1% significance.

6 Conclusion

This paper has focused on the problems of quantifying subjective sentiment and analyzing its use as a predictor for asset returns and prices. The study is based on a vast amount of data (approximately 100 GB) acquired from Twitter using text mining, with the sentiment quantified according to the General Inquirer Categories of the Harvard Psychological Dictionary.
The relation between the sentiment and the returns on the stock indices is analyzed using Granger causality tests and GARCH-type modeling.

The results of the analysis indicate that the Twitter sentiment apparently has predictive power for the returns on the S&P 500 and FTSE 100 financial indices. The results of the study further indicate that the volatility of the sentiment measures related to the FTSE 100 index appears to Granger cause the index return volatility, while this is not the case for the S&P 500 index.

An important problem that is left for further research is that of structural breaks in models for the relation and dependence between asset prices and returns and the sentiment, including the structural breaks due to the beginning of the ongoing COVID-19 pandemic.

The paper uses the Harvard Psychological Dictionary for sentiment analysis; further research may focus on applications of more recent sentiment analysis methods using artificial neural networks and other machine learning techniques, such as Bidirectional Encoder Representations from Transformers (BERT) for natural language processing.

Since the sentiment appears to Granger cause the returns on the financial indices considered, further analysis may also focus on predictive models incorporating the sentiment and other predictors, including the factors used in predictive regressions for financial returns, and on using the sentiment as a signal for trading. The analysis may be based on widely used econometric methods as well as on machine learning approaches.

As is common in the literature dealing with the analysis of dependence between financial and economic time series exhibiting volatility clustering, such as financial returns and foreign exchange rates (see, among others, Patton [29]), the analysis of Granger causality in the paper is based on estimated volatilities.
Further research may focus on the development and the use of methods that account for the uncertainty introduced in the first stage of the analysis by volatility estimation. In particular, the robust $t$-statistic approaches to inference under heterogeneity, dependence, and heavy-tailedness developed by Ibragimov and Müller [21,22] (see also Section 3.3 in the study by Ibragimov et al. [23]), and their extensions, appear promising in the context of econometric inference using general two-stage procedures, as the approaches do not require consistent estimation of the limiting variances of the estimators or of their standard errors. See, in particular, Ibragimov and Müller [21] for a discussion of the applicability and the properties of $t$-statistic approaches to inference in two-stage instrumental variable regressions and general GMM models, and Abduraimova [1] for applications of the approaches in IV regressions for the analysis of the effects of contagion on tail risk in complex financial networks.

Journal

Dependence Modeling, de Gruyter

Published: Jan 1, 2022

Keywords: sentiment; asset prices; asset returns; dependence; Granger causality; predictive regressions; autoregressive distributed lag models; GARCH models; volatility; 62P20; 91B84
