The Competitive Landscape of High-Frequency Trading Firms

Abstract

We examine product differentiation in the high-frequency trading (HFT) industry, where the “products” are secretive proprietary trading strategies. We demonstrate how principal component analysis can be used to detect underlying strategies common to multiple HFT firms and show that there are three product categories with distinct attributes. We study how HFT competition in each product category affects the market environment and present evidence that indicates how it influences the short-horizon volatility of stocks as well as the viability of trading venues.

Received October 10, 2016; editorial decision September 30, 2017 by Editor Itay Goldstein. Authors have furnished an Internet Appendix, which is available on the Oxford University Press Web site next to the link to the final published paper online.

Industry competitiveness is an important theme in the economics of industrial organization. It is of particular interest when the industry in question plays a central role in the performance of our securities markets, which are essential to information production and allocation of capital. In this paper, we closely examine product differentiation and intra-industry competition in the high-frequency trading (HFT) industry using unique regulatory data from Canada. HFT firms use computerized algorithms for proprietary trading, and they engage in electronic market making, cross-trading venue price arbitrage, short-term statistical arbitrage, and various other opportunistic strategies.1 This subset of financial service firms is responsible for most of the order flow and about half of the trading in equity markets in both the United States and Canada. The dominant role played by HFT firms has generated concerns over both the fairness and stability of equity markets, and these concerns are fueled in part by the secrecy that surrounds the operations of these firms.
Like Hoberg and Phillips (2016), our study is based on the idea that product similarity is germane to industry boundaries. Given that the “product” of an HFT firm is a proprietary (and secretive) trading strategy, we first grapple with how to define product categories and assess the number of firms competing in each one. Besanko et al. (2010) suggest identifying close substitutes by asking what a product does and when, where, and how it is used. We therefore identify HFT firms that are close substitutes by looking for strategies that do the same things at the same time in the same stocks. The more similar their activity, the more likely it is that they respond to the same stimuli and pursue the same profit opportunities. The industrial organization literature has long established that the more similar the products, the more competitive the environment. We invoke this “principle of differentiation” (Tirole 1988) to motivate measures of similarity in HFT strategies as proxies for the intensity of competition between HFT firms. We use these measures to study whether competition between HFT firms affects the short-horizon volatility of individual stocks, and to examine the relationship between HFT competition and market concentration. Our investigation of these issues is facilitated by data from the Investment Industry Regulatory Organization of Canada (IIROC), which is the national self-regulatory organization that oversees all equity trading venues in Canada. These data enable us to observe HFT activity coming from dealers’ proprietary trading desks as well as from HFT firms that use direct market access (DMA) arrangements with dealers. Moreover, we can track the activity of each HFT firm across all trading venues in Canada. Our sample consists of S&P/TSX 60 Index stocks, and we analyze activity from 30 days that represent bullish, bearish, and neutral trading environments during a period that runs from June 2010 through March 2011. 
We use unmasked user and firm identities to characterize 31 trading firms as “high-frequency traders.” The strategies of these HFT firms are secretive insofar as we do not know their objectives, but we can measure their output: orders, cancellations, and trades. Each strategy creates a unique profile of orders, cancellations, and trades, but it is impossible to understand the nature behind any one strategy simply by observing millions of data points. We use principal component analysis (PCA) to classify these profiles into product categories by decomposing the correlation matrix of the occurrences of orders, cancellations, and trades and construct the principal components that capture meta-profiles common to multiple HFT firms. Our analysis suggests that there are at least three distinct underlying common strategies (or product categories), each followed by several HFT firms. Seventeen HFT firms do not appear to pursue one of these common strategies, but the 14 firms that follow the common strategies represent most of the HFT activity, accounting for 96.21% of the messages that HFT firms send to the market and 78.97% of the volume they trade. Therefore, we find that competing HFT firms in three differentiated product categories generate most of the HFT activity in the Canadian market. The PCA does not attach any economic interpretation to the underlying common strategies it identifies. We therefore analyze the principal component scores by regressing them on various attributes of the market environment. Our results could be interpreted to suggest that one of the principal components represents a cross-venue arbitrage strategy, another one involves market making, and a third is related to short-horizon directional speculation. 
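The PCA step described above can be illustrated with a minimal sketch. It assumes a hypothetical input layout (one column of message counts per firm, one row per time interval) and uses synthetic data in which two groups of three firms each follow a shared driver. The function name `pca_from_correlation` and the simulated counts are illustrative assumptions, not the procedure or data used in the paper.

```python
import numpy as np


def pca_from_correlation(activity):
    """PCA on the correlation matrix of per-firm activity.

    activity: (n_intervals, n_firms) array; column j holds firm j's
    message counts per time interval (hypothetical input layout).
    """
    activity = np.asarray(activity, dtype=float)
    corr = np.corrcoef(activity, rowvar=False)        # firm-by-firm correlations
    eigvals, eigvecs = np.linalg.eigh(corr)           # eigh: symmetric matrices
    order = np.argsort(eigvals)[::-1]                 # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()               # share of total variance
    standardized = (activity - activity.mean(axis=0)) / activity.std(axis=0, ddof=1)
    scores = standardized @ eigvecs                   # principal component scores
    return eigvals, eigvecs, explained, scores


# Synthetic example (assumption): firms 0-2 react to one common driver,
# firms 3-5 to another, each with idiosyncratic noise.
rng = np.random.default_rng(0)
n_intervals = 2000
strategy_a = rng.poisson(5, n_intervals)
strategy_b = rng.poisson(5, n_intervals)
activity = np.column_stack(
    [strategy_a + rng.poisson(1, n_intervals) for _ in range(3)]
    + [strategy_b + rng.poisson(1, n_intervals) for _ in range(3)]
)
eigvals, loadings, explained, scores = pca_from_correlation(activity)
print(np.round(explained, 3))  # the first two components dominate
```

In this toy setting the two leading components pick up the two shared drivers, and the loadings indicate which firms belong to which group. The scores are the kind of series one could then regress on market attributes, as the text describes.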
While we stress that these labels are merely suggested by patterns of regression coefficients (they do not constitute definitive characterizations of these strategies), the important takeaway from our analysis is that there is clear product heterogeneity in this industry. Therefore, we believe it may be more constructive to discuss the impact of HFT on the market in the context of specific strategies rather than in the aggregate. Given the aforementioned “principle of differentiation,” we construct a measure of similarity in trading strategies within each product category as a proxy for the competitiveness of the product market. While competition has desirable properties, if it means that HFT firms engage in strategies that are highly correlated across stocks, the central role these firms play in our markets could exacerbate return movements and increase short-horizon stock volatility. We study whether this is the case, and our investigation uncovers the opposite result: the short-horizon volatility of most stocks loads negatively on the extent of similarity in strategies of HFT firms that follow the underlying common strategy that we associate with market making. We investigate potential drivers of this relationship and find that portions of short-horizon volatility related to both the permanent and temporary price impacts of trades drive the negative relationship with HFT competition. In other words, our results suggest that competition between HFT intermediaries in the spirit of Jovanovic and Menkveld (2016) could lead to faster revelation of hard information signals, lowering adverse selection costs and hence reducing volatility induced by the permanent price impact of trades. Competition in market making may also lower the compensation HFT market makers earn, which in turn explains the reduction in volatility that stems from the temporary price impact. 
These results indicate that HFT competition may bring dual benefits to the market in the form of potential lower rents for the services HFT firms provide as well as lower volatility. We stress, however, that our analysis reflects normal market conditions, and this conclusion may not hold during an intense market breakdown, such as during the Flash Crash in May of 2010 in the United States. We also investigate how competition between HFT firms relates to competition between trading venues in the market. Brokers are required to provide their customers with best execution and to search all venues on which a security is traded to find the most beneficial terms of trading, with an emphasis on the best price. Therefore, the trading venues displaying the best prices or smallest spreads attract more order flow. HFT firms play a central role in setting prices and spreads, and hence could affect competition between trading venues. We show that concentration of trading in the market is negatively related to similarity in strategies of HFT firms that follow two of the three underlying common strategies: cross-venue arbitrage and market making. We study one channel through which HFT competition affects market concentration by considering whether greater HFT competition on each trading venue increases the percentage of time that the venue features the best prices or the narrowest spreads, which we view as measures of the viability or competitiveness of the venue. We find that smaller trading venues are more viable while the dominant trading venue is less viable when HFT competition is more intense, which is likely one of the drivers behind the negative relationship we document between HFT competition and market concentration. We contribute to the literature by presenting new insights into the nature of competition in the HFT industry and its effects on the market. 
Our most important contribution is the ability to define product categories in the HFT industry despite the complete lack of information about the nature of each firm’s strategy and identify firms that directly compete with one another in each category. Our use of PCA to investigate product differentiation is novel and could potentially be utilized by others in similar contexts. We also provide new insights into the impact of HFT competition within each product category on the market environment, documenting evidence that it lowers the short-horizon volatility of stocks and influences market concentration by enhancing the viability of smaller trading venues. Finally, we use our unique data to study several publicly available measures to identify those that can best capture general HFT activity as well as the specific underlying common strategies we document in the paper.

1. Literature Review

Our paper joins a rapidly expanding body of literature on HFT in financial markets. For recent surveys on the topic of HFT, see Jones (2013), Goldstein, Kumar, and Graves (2014), and Menkveld (2016). Among the theoretical contributions are those of Han, Khapko, and Kyle (2014), Hoffmann (2014), Biais, Foucault, and Moinas (2015), Ait-Sahalia and Sağlam (2016), Jovanovic and Menkveld (2016), and Rosu (2016). In particular, Jarrow and Protter (2012) show that, when HFT firms respond to common signals, their correlated activity affects market prices, thereby increasing market volatility and generating abnormal profits for these firms. While we indeed show that multiple HFT firms pursue correlated strategies, we find that the short-horizon volatility of most stocks loads negatively, not positively, on a measure of similarity between HFT strategies. Budish, Cramton, and Shim (2015) and Menkveld and Zoican (2017) model HFT firms that pursue heterogeneous strategies in the market. We find empirical evidence for such heterogeneity.
Many empirical contributions focus on intraday analysis of aggregate HFT behavior (e.g., Carrion 2013; Hasbrouck and Saar 2013; Brogaard, Hendershott, and Riordan 2014; Jarnecic and Snape 2014; Hirschey 2016; Kirilenko et al. 2017). Several papers use data on trading by individual HFT firms, rather than aggregate behavior, to investigate HFT strategies (Baron, Brogaard, and Kirilenko 2012; Hagströmer and Nordén 2013; Hagströmer, Nordén, and Zhang 2014; Benos and Sagade 2016), although they predominantly focus on whether HFT firms demand or supply liquidity. Clark-Joseph (2014) uses index futures data from the CME to examine exploratory trading by HFT firms. In studies more closely related to our analysis, Breckenfelder (2013) and Brogaard and Garriott (2016) examine one aspect of competition: the entry and exit of HFT firms. Specifically, Brogaard and Garriott (2016) analyze data from one alternative trading system in Canada and show that new entrants take volume away from incumbents even as they increase the overall market share of HFT firms, and these effects decline with each successive entry. They also find that market liquidity improves after the entry of an HFT firm and deteriorates after an exit (especially when only one or two HFT firms trade in a given stock). Breckenfelder (2013) uses data from the Stockholm Stock Exchange and finds the opposite result: deterioration of liquidity for entries of HFT firms and improvement for exits.
Menkveld (2013) examines the trading of one HFT firm and makes the case that this particular firm enhances the viability of a new trading venue, which is related to our result that competition between HFT firms is positively related to the viability of smaller trading venues.2 Hagströmer and Nordén (2013) find that HFT firms that predominantly supply liquidity appear to mitigate intraday volatility, which complements our finding pertaining to the relationship between HFT competition and volatility, although they do not address the competition dimension.3

Lastly, our paper joins several other papers that use Canadian order-level data to investigate HFT-related issues. Malinova, Park, and Riordan (2013) study the cost of trading around changes in market structure that affect mainly HFT firms and other algorithmic traders. Comerton-Forde, Malinova, and Park (Forthcoming) study the nature of liquidity provision around changes in dark trading regulation, and Korajczyk and Murphy (2017) examine HFT liquidity provision to large institutional trades. Brogaard, Hendershott, and Riordan (2016) study how limit orders and trades of high-frequency traders contribute to price discovery.

2. The Canadian Market: Sample, Data, and HFT Firms

Our data come from the Investment Industry Regulatory Organization of Canada (IIROC). All trading venues in Canada are required to provide data feeds to IIROC, which performs both real-time and post-trade market surveillance of trading activities. Directly connecting to trading venues in Canada requires an IIROC membership, and IIROC admits only security dealers as members. Other financial firms, such as asset managers, banks, insurance companies, and proprietary trading firms, can trade through dealers’ brokerage arms or via direct market access (DMA) arrangements provided by dealers. DMA arrangements allow nondealer trading firms to access trading venues without having to hand over their orders to brokers for execution.
During our sample period (June 2010 through March 2011), Canada has five trading venues organized as electronic limit-order books that trade stocks listed on the Toronto Stock Exchange: Alpha ATS Limited Partnership (ALF), Chi-X Canada ATS (CHX), Omega ATS (OMG), Pure Trading (PTX), and the Toronto Stock Exchange (TSX).4 Trading on crossing networks (“dark trading”) in Canada during our sample period is essentially limited to one dark pool (MATCH Now) with no more than a 3% market share.5

2.1 Sample

We carry out the empirical work on 30 trading days that are selected to capture variation across market conditions. We rank the daily returns of the S&P/TSX Composite Index from June 2010 through March 2011, and select the 10 worst days (down days), the 10 best days (up days), and the 10 days closest to (and centered on) zero return (flat days). This design allows us to examine whether similarities in the strategies of HFT firms depend on market conditions (as summarized by the daily return on a broad index). Our sample stocks consist of 52 constituents of the S&P/TSX 60 Index, which accounts for approximately 73% of Canada’s total equity market capitalization. Eight stocks are excluded from the Index because they converted from income trusts to corporate structures (five), listed for less than one year (one), delisted (one), or changed symbols (one). Panel A of Table 1 presents summary statistics for the sample.6 The mean market capitalization is 19.4 billion Canadian dollars (CAD), with an average stock price of 39.1 CAD, and average daily volume of 78.2 million CAD. Panel A clearly shows that our sample period encompasses three distinct market conditions: the average daily returns of stocks on down, flat, and up days are -1.72%, -0.08%, and 1.66%, respectively.

Table 1
Summary statistics

A. Sample stocks

Days   Stocks            MktCap          Price   StdRet           CADVolume       Return
                         (Million CAD)   (CAD)   (% 30min ret)    (1000; daily)   (% daily)
All    S&P 60   Mean     19,412          39.1    0.40             78,156          -0.04
                Median   11,401          36.9    0.40             51,120          -0.04
Down   S&P 60   Mean     19,203          38.3    0.43             81,195          -1.72
                Median   11,299          36.4    0.43             55,892          -1.56
Flat   S&P 60   Mean     19,499          39.7    0.34             70,614          -0.08
                Median   11,840          37.4    0.34             50,078          -0.07
Up     S&P 60   Mean     19,535          39.5    0.39             82,903           1.66
                Median   11,574          37.0    0.37             54,097           1.46

B. HFT firms

HFT            MktShr       MktShr       Trades    Sub/Can     Messages    Ord/Trd   CrossEnd    CrsVenue
firms          (% volume)   (% trades)   (daily)   (daily)     (daily)     ratio     inventory   activity
All    Mean    1.50         1.70          19,445   1,056,838   1,063,974    92.7     14.9         42,775
       Med.    0.34         0.44           5,035      97,146      97,288    40.4      2.4            619
MS1    Mean    6.58         8.95         102,035   5,466,547   5,496,423    49.5     73.0        216,133
       Med.    7.23         8.60          98,048   3,272,863   3,312,989    42.2     70.1        177,952
MS2    Mean    2.67         2.08          23,730     982,137     995,868    38.3     13.4         38,307
       Med.    2.79         2.11          24,003     147,090     162,729     6.9      4.3          1,821
MS3    Mean    0.19         0.22           2,489     238,236     239,157   116.5      4.3          9,667
       Med.    0.05         0.04             714      40,129      40,268    71.4      1.8            168

Our sample consists of 52 stocks from the S&P/TSX 60 Index. We rank the daily returns of the S&P/TSX Composite Index from June 2010 through March 2011 and select the 10 worst days (down days), the 10 best days (up days), and the 10 days closest to and centered on zero return (flat days) for a 30-day sample period. Panel A presents summary statistics for the sample stocks: market capitalization, price, the standard deviation of 30-minute returns, daily volume (in Canadian dollars), and daily return. Panel B presents summary statistics for the 31 high-frequency trading (HFT) firms that we identify using data from the Investment Industry Regulatory Organization of Canada (IIROC). These 31 firms are further categorized into three subgroups according to market share of volume: MS1 (market share of at least 4%; 4 firms), MS2 (market share of between 1% and 4%; 6 firms), and MS3 (the rest of the HFT firms). Market share (in terms of volume or trades) is computed by dividing the trading undertaken by each HFT firm by total trading in the market. Trades consist of executions of both marketable orders and nonmarketable orders. Sub/Can is the number of (nonmarketable) limit-order submissions and cancellations (i.e., all nonexecution messages) that a firm sends to the market, where a modification of an order counts as a cancellation and a resubmission. Messages are all the orders (both marketable and nonmarketable) and cancellations that a firm sends to the market. The orders/trades ratio is defined for each HFT firm as messages divided by trades. The mean order-to-trade ratio is the average of the order-to-trade ratios of the individual HFT firms, and therefore due to its nonlinear nature and the heterogeneity of the firms is not equal to the cross-sectional mean number of messages divided by the cross-sectional mean number of trades. CrossEndInventory is the number of times per day that an HFT firm’s intraday inventory position crosses its end-of-day inventory position. CrsVenueActivity is the number of daily 10-millisecond time stamps in which an HFT firm is both (1) sending messages (orders or cancellations) to TSX and (2) sending messages to another trading venue in the same or adjacent time stamp. We compute the measures for each HFT firm using all days in our sample period and then provide in the table cross-firm means and medians for all HFT firms as well as for subgroups by market share.
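Two of the Panel B measures defined in the notes to Table 1 can be made concrete with a small sketch. The function names, the toy inputs, and the exact crossing convention (counting sign changes of the gap between intraday and end-of-day inventory, starting from a zero position and ignoring points that sit exactly at the end-of-day level) are assumptions made for illustration; the paper does not spell out its implementation.

```python
import numpy as np


def order_to_trade_summary(messages, trades):
    """Return (mean of per-firm ratios, ratio of cross-firm means)."""
    messages = np.asarray(messages, dtype=float)
    trades = np.asarray(trades, dtype=float)
    mean_of_ratios = float(np.mean(messages / trades))
    ratio_of_means = float(messages.mean() / trades.mean())
    return mean_of_ratios, ratio_of_means


def count_end_inventory_crossings(signed_volumes):
    """Count how often intraday inventory crosses its end-of-day level.

    signed_volumes: one firm-day of trades in time order, positive for
    buys and negative for sells (hypothetical input layout). Inventory
    is assumed to start the day at zero.
    """
    path = np.concatenate(([0.0], np.cumsum(np.asarray(signed_volumes, dtype=float))))
    gap = path - path[-1]              # distance from the end-of-day position
    side = np.sign(gap[gap != 0])      # ignore points exactly at that level
    return int(np.sum(side[1:] != side[:-1]))


# Two hypothetical firms: ratios of 100 and 10 average to 55,
# while total messages over total trades gives 600 / 15 = 40.
mean_of_ratios, ratio_of_means = order_to_trade_summary([1000, 200], [10, 20])

# Inventory path 0, 100, -200, 200, 100, -100, 0 crosses zero three times.
crossings = count_end_inventory_crossings([100, -300, 400, -100, -200, 100])
print(mean_of_ratios, ratio_of_means, crossings)  # 55.0 40.0 3
```

The same gap appears in Panel B itself: dividing the mean number of messages by the mean number of trades for all HFT firms (1,063,974 / 19,445) gives roughly 54.7, whereas the reported mean of the firm-level ratios is 92.7.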
2.2 Data

The order-level data we obtain from IIROC cover all Canadian trading venues, and contain information about order submissions, cancellations, modifications, and executions with 10-millisecond time stamps. The time stamps from all trading venues are synchronized with the regulator’s time stamps and reflect the local time at which a message (a general term used for submissions, cancellations, and executions of orders) is processed. The record of each message contains the following information: ticker symbol, order side (buy or sell), trading venue, price, total quantity, nondisplayed quantity, broker ID, user ID, order type (e.g., client orders, inventory trading), various flags (e.g., short sell, market on close, opening trade), and order/trade ID. Trade messages are identified as buyer-initiated or seller-initiated. Events in an order’s life, including modification, partial fill, full fill, and cancellation have the same order ID. We provide a more complete description of the data in the Online Appendix.

One advantage of our data is that the same user IDs are used on all trading venues in Canada. Most HFT firms in Canada do not operate as licensed dealers but rather gain access through DMA arrangements with one or multiple dealers. We obtain tables that identify the user IDs of all trading firms that use DMA arrangements. Hence, we can accurately detect the activity of each HFT firm on all trading venues regardless of whether it trades via multiple DMA arrangements with dealers or uses multiple user IDs. For dealers who may have brokerage arms in addition to proprietary trading operations, we are able to distinguish any order or trade made in the capacity of an agent. We therefore include in our measurement of these dealers’ HFT activity only those orders and trades identified in the order type field as proprietary activity.

2.3 Identifying HFT firms

Researchers have used a variety of methods to identify HFT activity, dictated mostly by the quality and attributes of available data.
Some papers use publicly available data to identify aggregate HFT activity (e.g., Hasbrouck and Saar 2013), whereas others have used data with masked user IDs in conjunction with quantitative criteria (e.g., end-of-day inventory that is less than 20% of daily volume; see Brogaard, Hendershott, and Riordan 2016). The method considered most accurate for identifying HFT firms requires knowledge of the trading entities. The advantage of our data is that they include unmasked user and firm identities, hence providing us with an unparalleled ability to observe the HFT firms.7

Our data contain hundreds of trading firms, which makes it difficult to investigate every one of them at length. We therefore rank trading firms along several dimensions simply as a way to flag candidates for more thorough analysis. The dimensions we use are number of trades, number of messages, orders-to-trades ratio, the number of times a firm’s intraday inventory positions cross the end-of-day position (or zero), and cross-trading-venue activity in the same or neighboring time stamps. For example, if a firm has almost no proprietary trading and executes only a handful of trades every day, we believe that it is not engaged in what is generally perceived as HFT and therefore does not merit additional attention. We do not set a formal cutoff for any of these dimensions because our classification of trading firms is not quantitative in nature but rather manual; the quantitative analysis simply eliminates firms with little relevant activity.8 It is important to stress that we manually examine many firms (those that were eventually identified as engaging in HFT as well as other firms) to ensure that we capture all relevant entities. For each of those firms, we use its name to search for information about the firm’s business in media news reports and on the Web. For example, some firms are mentioned in newspaper articles as engaging in HFT.
Other firms have Web sites on which they explicitly state that they pursue high-frequency strategies. Still others are listed as participants in TSX’s Electronic Liquidity Providers (ELP) program, which offers fee incentives to HFT firms that provide liquidity.9 We use this information to ensure that we identify HFT firms even if they do not rank highly along some of the quantitative dimensions. Our procedure enables us to identify 31 firms as HFT firms. Some of these firms have DMA arrangements with dealers while others are dealers engaged in proprietary trading. Our goal in this identification procedure is to ensure that we study as complete a set of HFT firms as possible. Since we have both user IDs and a mapping of the user IDs to the firms, the first choice we need to make is whether to operate at the user-ID level or aggregate all user IDs that belong to the same firm. There are no rules (to the best of our knowledge) that guide how firms utilize user IDs.10 One possible concern is that firms assign multiple user IDs to the order flow of an algorithm and send orders via DMA arrangements with multiple dealers to make it more difficult for outsiders to detect their activity. Routing via multiple dealers could also be driven by the desire to limit dependence on one dealer. We choose to work at the HFT-firm level (i.e., to aggregate all user IDs belonging to the same firm) to make our analysis robust to whatever gaming could be going on in terms of the firm’s discretionary assignment of user IDs to its orders. The 31 HFT firms we identify send orders via 572 user IDs during our sample period.11 It is important to stress that, for broker-dealers, these IDs are used solely for proprietary trading. Figure 1 plots the number of user IDs for each of the 31 HFT firms (sorted from the largest to the smallest). The mean number of user IDs per HFT firm is 18.45 and the median is 4. 
The large dispersion in the number of user IDs per firm is clearly seen in the figure: one firm sends orders through 137 IDs, whereas 6 HFT firms operate through a single ID. The correlation between the HFT firms’ market share in terms of volume traded and the number of IDs they use is 0.362 (i.e., firms with a larger market share operate through more IDs). However, the number of IDs is essentially uncorrelated (0.015) with market share in terms of the number of messages a firm sends to the market. If an HFT firm operates multiple algorithms and actually designates a separate user ID to each algorithm, our procedure will lump them together, although the principal component analysis we implement in Section 3 could potentially differentiate these separate strategies.12

Figure 1 HFT firms: Number of user IDs
In this figure, we plot the number of active user IDs for each of the 31 high-frequency trading (HFT) firms (sorted from the largest to the smallest). Our data come from the Investment Industry Regulatory Organization of Canada (IIROC). The record of each message contains a user ID, and we have a mapping from user IDs to trading firms. We identify 31 HFT firms. Some of these firms have direct market access (DMA) arrangements with dealers while others are dealers engaged in proprietary trading. We choose to work at the HFT-firm level (i.e., to aggregate all user IDs belonging to the same firm) to make our analysis robust to whatever gaming could be occurring in terms of a firm’s discretionary assignment of user IDs to its orders. The 31 HFT firms we identify send orders via 572 user IDs during our sample period.

Panel B of Table 1 provides summary statistics for the 31 HFT firms as well as for categories formed on market share of volume. Specifically, MS1 consists of four firms with market shares greater than 4%, MS2 consists of six firms with market shares between 1% and 4%, and MS3 consists of the rest of the firms. Overall, these 31 HFT firms have 46.4% of the market share in terms of volume. The average number of daily trades of a firm in our sample is 19,445, but it ranges from 102,035 for firms in MS1 to 2,489 for firms in MS3. Similarly, the number of daily messages a firm sends to the market (submissions and cancellations of nonmarketable limit orders as well as the number of marketable orders) averages 1,063,974, ranging from 5,496,423 for firms in MS1 to 239,157 for firms in MS3. The mean (median) number of times the intraday inventory position of a firm in our sample crosses its end-of-day inventory increases with market share: 4.3 (1.8) for MS3, 13.4 (4.3) for MS2, and 73.0 (70.1) for MS1. Similarly, the number of 10-millisecond time stamps in which an HFT firm is active both on the primary market (TSX) and on another trading venue increases with market share. However, it is interesting to note that the orders/trades ratio does not share the same pattern, with the highest mean and median (116.5 and 71.4, respectively) characterizing the smaller HFT firms.

3. HFT Competition and Product Differentiation

3.1 Differentiation, competition, and correlated activity

The HFT industry is shrouded in secrecy. Most HFT firms are private and hence reveal no financial or operating information; detailed information about proprietary trading desks of larger, publicly listed firms is seldom disclosed to the public. In general, HFT firms do not reveal information about the objectives of their algorithms beyond speaking generally about concepts such as “liquidity provision” and “arbitrage.” How can one investigate the extent of competition in such an industry? Examining the economic profits of firms would be ideal, but the costs associated with hiring and retaining the individuals who develop the algorithms as well as other operating costs are simply unavailable. Researchers can only ascertain firms’ net trading revenues (or what is left when shares are bought and sold) from data sets like ours, but trading revenues without costs cannot be used to make a correct inference about economic profits or the competitiveness of an industry.13

Our first goal is to study whether there are product categories in the HFT industry such that each product category is characterized by multiple HFT firms that pursue similar strategies. Hoberg and Phillips (2016), for example, design a new classification scheme to define industries using text-based analysis of product descriptions. Besanko et al. (2010) define products that are close substitutes as having the same or similar product performance characteristics (i.e., what they do) and having the same or similar occasions for use (i.e., when, where, and how they are used). These criteria motivate the manner in which we identify HFT strategies that are close substitutes: these strategies submit, cancel, or execute orders at the same time in the same stocks.
The activity of each HFT firm consists of the messages (e.g., submission and cancellation of limit orders) it sends to the market to implement its strategy. The higher the correlation between the activities of two HFT firms, the greater is the similarity (or the lower is the differentiation) in their proprietary trading strategies. In other words, the higher the correlation, the more likely it is that the firms pursue the same profit opportunities and respond to the same trading signals. Hence, we use the correlation between the manifestations of strategies (e.g., the messages HFT firms send to the market) as a measure of product similarity. In particular, we decompose the correlation matrix in the activities of the HFT firms using principal component analysis to establish the main product categories and determine how many HFT firms operate in each one. According to Aldridge (2013), the secrecy in the HFT industry stems from purely competitive business considerations. She notes that every strategy has a finite capacity, and a competitor is bound to significantly diminish the profitability of an HFT strategy. As a result, when multiple firms carry out the same strategy, each firm’s profits go down. The industrial organization literature has long recognized the “principle of differentiation,” according to which firms want differentiated products to soften competition and increase their profits (Tirole 1988). Hotelling (1929) showed in the context of two firms that the more similar the products, the closer one is to Bertrand competition, and this result has been generalized in many other papers (see, e.g., Salop 1979 for the case of $$n$$ firms). In general, Shy (1995) states that in both Bertrand and Cournot games with differentiated products, the firms’ profits increase when the products become more differentiated (Propositions 7.1 and 7.2). 
In both cases, product differentiation softens competition; the more similar the products, the more competitive the environment. Strategy differentiation for HFT firms helps alleviate the capacity problem and increases profitability in the same way that product differentiation in the aforementioned industrial organization models helps alleviate price competition and increases profits. The more similar the strategies, the more intensely HFT firms compete. In Section 4 we use measures of strategy similarity as proxies for competition to investigate how HFT competition affects the market environment. One conceptual issue we want to clarify is that we use the terms “strategy” and “activity” somewhat interchangeably. While a strategy is typically a plan of action, we do not know the specifics of the algorithms employed by each HFT firm. We can measure only the outcomes of a strategy, although it is natural to recognize that actions we observe are manifestations of a strategy, and hence to treat measures of these actions as representing the strategy. Are strategy and activity necessarily the same? One can presumably construct examples in which they diverge, such as when two distinct strategies produce the same action in one particular state of the world. For example, market making and aggressive news trading could both be associated with a high level of activity when there is an outburst of news (Foucault, Hombert, and Roşu 2016). However, these two strategies would have completely different patterns of activity in other states of the world (e.g., when there is no news). The strength of our methodology is that we draw on over 36 million observations to estimate the similarities in strategies: the PCA uses every 1-second interval during the market’s opening hours, for 30 days (which represent three distinct market conditions), in 52 stocks. 
Only if the strategies of multiple HFT firms produce similar actions across the many intervals and stocks would our methodology identify an underlying common strategy and point to the firms that follow it. Therefore, while our empirical methodologies exploit correlated activity, our inference is about similarity in the strategies of HFT firms.

3.2 Describing HFT activity

The main variable we use to describe HFT firm activity in an interval, MSG, emphasizes actions initiated by an HFT firm, and is defined as the sum of three components: the number of submissions of nonmarketable limit orders, the number of cancellations of nonmarketable limit orders, and the number of marketable order executions. Hence, MSG for an interval (say one second) describes the total number of messages that an HFT firm sends to the market to initiate a change in its position (either in terms of presence in the limit order book or to transact immediately).14

We investigate two additional variables that summarize HFT activity to ensure the robustness of our conclusions. The first, TRD, is the number of trades made by an HFT firm in an interval. These trades could occur as the result of marketable orders’ submissions or the execution of previously submitted limit orders that rested in the book. The second, LMT, comprises all submissions and cancellations of nonmarketable limit orders, and hence describes the actions a firm takes in the interval solely to change its presence in the limit order book.

To economize on the presentation of the results of the PCA in this section, we report only the results for MSG. In Section 4, where the regression and partial correlations analyses enable an efficient presentation of the results for the three measures, we report the results side by side.
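As an illustration of how the three activity measures relate to one another, the following minimal sketch counts MSG, TRD, and LMT per firm, stock, and 1-second interval. The record layout and the four message-kind labels are assumptions made for this example and do not reflect the actual encoding of the regulatory data; the sketch also treats each marketable order execution as a single trade.

```python
from collections import Counter, defaultdict

# Hypothetical message kinds (assumed labels, not the regulatory feed's encoding).
SUBMIT = "submit_limit"          # submission of a nonmarketable limit order
CANCEL = "cancel_limit"          # cancellation of a nonmarketable limit order
MKT_EXEC = "marketable_exec"     # execution of a marketable order (firm is active)
PASSIVE_EXEC = "passive_exec"    # execution of a resting limit order (firm is passive)


def activity_measures(messages):
    """Count MSG, TRD, and LMT for each (firm, stock, second) cell.

    `messages` is an iterable of (firm, stock, second, kind) tuples.
    """
    counts = defaultdict(Counter)
    for firm, stock, second, kind in messages:
        counts[(firm, stock, second)][kind] += 1

    measures = {}
    for key, c in counts.items():
        lmt = c[SUBMIT] + c[CANCEL]            # changes to presence in the book
        msg = lmt + c[MKT_EXEC]                # all actions initiated by the firm
        trd = c[MKT_EXEC] + c[PASSIVE_EXEC]    # trades, whether active or passive
        measures[key] = {"MSG": msg, "TRD": trd, "LMT": lmt}
    return measures


# Toy records for two hypothetical firms in one stock and one interval.
sample = [
    ("F01", "ABC", 0, SUBMIT),
    ("F01", "ABC", 0, SUBMIT),
    ("F01", "ABC", 0, CANCEL),
    ("F01", "ABC", 0, MKT_EXEC),
    ("F01", "ABC", 0, PASSIVE_EXEC),
    ("F02", "ABC", 0, SUBMIT),
]
result = activity_measures(sample)
```

Note that a passive execution enters TRD but not MSG, since the firm did not initiate it in that interval, which is the distinction the definitions above draw.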
Our preference for MSG as the main measure is based in part on the fact that most HFT activity comprises submissions and cancellations of orders, not actual trading, and hence a measure that includes these submissions and cancellations would offer a more complete portrayal of their activity. This idea is also highlighted in Brogaard, Hendershott, and Riordan (2016) and Subrahmanyam and Zheng (2016), who stress the importance of HFT limit order behavior.

In Section 5, we provide evidence that the similarity in the activity measures of HFT firms is much greater when we consider total activity (buys plus sells) rather than directional activity (buys minus sells). For this reason, we define MSG in terms of total activity. We also find that the strategies of HFT firms are similar in the down, flat, and up days, and therefore our main results are presented for the entire sample period rather than by market condition.

3.3 Principal components analysis

Our first goal is to ascertain whether multiple HFT firms pursue similar strategies that define a particular product category, and to determine how many such product categories can be identified in the HFT industry in Canada. We use a data-driven methodology, principal component analysis (henceforth, PCA), to help us answer these questions.

We use such a data-driven methodology because of the difficulty of explicitly characterizing the strategy of an HFT firm just by looking at millions of messages that each HFT firm is sending to the market every day. The space of possible signals to which an HFT firm could respond is enormous, and interaction between the strategies of multiple HFT firms means that a complete characterization may require examining the activities of all HFT firms simultaneously. Unfortunately, this is simply not something that a person can do by glancing at the data.
Furthermore, for a researcher to study a strategy by hypothesizing a particular economic goal and devising an algorithm to carry out a successful strategy would likely require the same effort an HFT firm invests in developing an algorithm. Even if one were able to develop such an algorithm, the activity implied by the algorithm may not exactly correspond to any of the strategies currently employed by HFT firms active in the market. Hence, a more promising approach is to use a data-driven methodology to characterize similarities between the strategies of the HFT firms and categorize them into subgroups. In general, PCA may be used to discover and summarize the pattern of intercorrelations between variables. It helps in grouping variables that are correlated with one another, presumably because they are driven by the same underlying phenomenon. In our application, we think about the HFT firms’ activities as the “variables” that we seek to summarize, and the underlying phenomenon as the strategies that have the same objective and hence belong to the same product category. This analysis helps us understand HFT strategies in several ways. First, it tells us whether there are underlying common strategies that HFT firms follow. Second, it identifies the firms in each product category and indicates how close each firm’s strategy is to the underlying common strategies. Third, it enables analysis of the empirical representations of these underlying common strategies (i.e., the principal component scores) and exploration of how they relate to the market environment as a means of learning about the economic nature of the strategies. 
The input for the PCA is a matrix of HFT firms’ activities (MSG) with 31 columns (one for each HFT firm) and 36,504,000 rows (the number of stocks times the number of 1-second intervals in the sample period).15 In other words, the stocks are stacked one on top of another and we analyze the time-series and cross-sectional sources of variation in HFT activity together. As with any data-driven methodology, it is important to note the economic meaning of various implementation choices. For example, the PCA decomposes the correlation matrix (rather than the covariance matrix) of the strategies. In other words, the variables that describe the activity (or strategy) of each HFT firm are standardized by subtracting the mean activity of each HFT firm (across all stocks and time intervals) and dividing by its standard deviation. Standardizing the variables eliminates the possibility that one of them would dominate the procedure because it has a much higher scale or range. From an economic perspective, this means that our procedure gives the very active HFT firms the same weight as any other HFT firm (each contributing one unit of variance to the total variance). This helps us focus on similarity in strategies even if some firms are larger than others. The manner in which we pool all stocks for the estimation of the PCA means that it can take into account various strategies that have a “portfolio” flavor to them. For example, a strategy that involves market-wide signals (e.g., from an ETF) that generates activity in multiple stocks in the same intervals could easily be picked up by the procedure. When conducting the PCA, we must choose how many principal components to retain for subsequent analysis. The corresponding economic question is how many separate underlying strategies are common to a significant number of the HFT firms in our sample. We conduct a Scree test by plotting the eigenvalues of the principal components and looking for a natural break. 
Each of the 31 firms contributes $$100/31=3.2{\%}$$ of the variance. However, meaningful principal components would naturally explain more of the variance. We find that the first principal component explains 11.66% of the variance, the second 4.5%, and the third 4.02%, together accounting for over 20% of total variation. Additional principal components account for less than 4% of the variance each, and there seems to be a natural break after three components. Hence, we extract three principal components for subsequent analysis. The finding that about 20% of the variance is explained by the first three principal components reflects the economic nature of this industry. In some applications of PCA, researchers drop some of the variables in order for the procedure to explain more of the variance in the remaining set. In our implementation, this is counterproductive. It is important to stress that we do not use PCA to identify HFT firms, but rather we use our knowledge of the trading entities to determine the sample of HFT firms. We are therefore not interested in maximizing the variance explained; we want the methodology to reflect the actual intercorrelations in the set of all HFT firms that operate in Canada. We use the PCA to ascertain the number of underlying common strategies and identify the firms that follow each common strategy. Part of our contribution is to demonstrate the extent of heterogeneity in HFT strategies and that there are many firms with more unique strategies. Another choice we need to make when implementing the methodology involves determining the rotation of the principal components. The rotation is meant to help us interpret the loadings, which are the coefficients of each HFT firm on each of the principal components. We utilize the commonly used varimax orthogonal rotation. 
The loading of an HFT firm on a particular principal component using this rotation has a simple interpretation: it is equivalent to the bivariate correlation between the HFT firm’s activity and the underlying common strategy represented by the principal component. Our conclusions are robust to using other rotations. Table 2 presents the loadings from the PCA using 1-second intervals. Each principal component can be viewed as representing an underlying strategy that is common to multiple HFT firms and which results in the correlated behavior of the firms, and therefore we use the terms “underlying common strategy” and “principal component” interchangeably. The larger (i.e., closer to 1) the loading of a particular HFT firm on a principal component, the more similar the firm’s strategy to that underlying common strategy. For each principal component, the loadings are sorted from the most positive to the most negative, and the 31 firms are represented by F01 through F31. The largest loading on the first principal component, for example, belongs to firm F14 and signifies 0.76 correlation between the strategy of this firm and the underlying common strategy represented by the principal component. 
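The steps described in this subsection (standardizing each firm's activity, decomposing the correlation matrix, inspecting the share of variance per component, and applying a varimax rotation) can be sketched as follows. This is a minimal illustration on simulated data, not the implementation used in the paper: the function names, the simulated activity matrix, the number of firms, and the convergence settings are all assumptions made for the example.

```python
import numpy as np


def pca_on_correlation(X, n_keep):
    """PCA of the correlation matrix of X (rows: stock-intervals, columns: firms)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)       # each firm gets unit variance
    corr = Z.T @ Z / Z.shape[0]                    # correlation matrix of activities
    eigval, eigvec = np.linalg.eigh(corr)          # eigh returns ascending eigenvalues
    order = np.argsort(eigval)[::-1]
    eigval, eigvec = eigval[order], eigvec[:, order]
    explained = eigval / eigval.sum()              # variance share per component
    loadings = eigvec[:, :n_keep] * np.sqrt(eigval[:n_keep])
    scores = Z @ eigvec[:, :n_keep] / np.sqrt(eigval[:n_keep])  # unit-variance scores
    return Z, explained, loadings, scores


def varimax(loadings, max_iter=500, tol=1e-10):
    """Orthogonal varimax rotation; returns rotated loadings and the rotation matrix."""
    p, k = loadings.shape
    R = np.eye(k)
    crit = 0.0
    for _ in range(max_iter):
        L = loadings @ R
        grad = loadings.T @ (L**3 - L @ np.diag((L**2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(grad)
        R = u @ vt                                 # closest orthogonal matrix
        new_crit = s.sum()
        if crit > 0 and new_crit < crit * (1 + tol):
            break
        crit = new_crit
    return loadings @ R, R


# Simulated activity counts: 8 hypothetical firms, two groups sharing a common driver.
rng = np.random.default_rng(0)
n_obs, n_firms = 20000, 8
common = rng.poisson(3.0, size=(n_obs, 2)).astype(float)
X = rng.poisson(2.0, size=(n_obs, n_firms)).astype(float)
X[:, :3] += common[:, [0]]     # firms 0-2 respond to the first common signal
X[:, 3:5] += common[:, [1]]    # firms 3-4 respond to the second common signal

Z, explained, loadings, scores = pca_on_correlation(X, n_keep=2)
rot_loadings, R = varimax(loadings)
rot_scores = scores @ R        # scores of the rotated components

# Each rotated loading equals the correlation between a firm's standardized
# activity and the corresponding rotated component score.
check = np.corrcoef(Z.T, rot_scores.T)[:n_firms, n_firms:]
```

Because the rotation is orthogonal and the scores are scaled to unit variance, the matrix `check` coincides with `rot_loadings`, which is the interpretation of loadings as bivariate correlations used in the text. Plotting `explained` against the component index gives the scree plot.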
Table 2 Principal components analysis: Loadings

| HFT | PC1 loading | HFT | PC2 loading | HFT | PC3 loading |
|---|---|---|---|---|---|
| F14 | 0.76* | F27 | 0.71* | F17 | 0.51* |
| F16 | 0.67* | F08 | 0.52* | F23 | 0.48* |
| F04 | 0.57* | F31 | 0.41* | F19 | 0.47* |
| F24 | 0.54* | F05 | 0.40* | F20 | 0.43* |
| F31 | 0.48* | F02 | 0.39* | F26 | 0.33 |
| F28 | 0.46* | F28 | 0.38* | F12 | 0.22 |
| F20 | 0.40* | F30 | 0.34 | F08 | 0.20 |
| F17 | 0.38* | F06 | 0.27 | F18 | 0.18 |
| F29 | 0.33 | F12 | 0.24 | F14 | 0.15 |
| F01 | 0.32 | F26 | 0.22 | F29 | 0.13 |
| F21 | 0.26 | F10 | 0.17 | F21 | 0.12 |
| F27 | 0.26 | F01 | 0.16 | F03 | 0.12 |
| F06 | 0.26 | F20 | 0.13 | F10 | 0.08 |
| F07 | 0.23 | F04 | 0.12 | F30 | 0.05 |
| F26 | 0.23 | F24 | 0.11 | F05 | 0.03 |
| F08 | 0.22 | F09 | 0.11 | F02 | 0.03 |
| F11 | 0.15 | F23 | 0.10 | F06 | 0.03 |
| F12 | 0.08 | F11 | 0.07 | F16 | 0.02 |
| F19 | 0.07 | F18 | 0.06 | F25 | 0.01 |
| F05 | 0.04 | F14 | 0.05 | F13 | 0.01 |
| F22 | 0.03 | F03 | 0.04 | F27 | 0.00 |
| F10 | 0.01 | F15 | 0.03 | F09 | 0.00 |
| F15 | 0.01 | F25 | 0.02 | F22 | –0.01 |
| F09 | 0.00 | F22 | 0.01 | F15 | –0.01 |
| F25 | 0.00 | F13 | 0.01 | F04 | –0.02 |
| F03 | 0.00 | F29 | 0.00 | F28 | –0.07 |
| F13 | 0.00 | F16 | –0.03 | F31 | –0.09 |
| F02 | –0.04 | F21 | –0.07 | F24 | –0.13 |
| F30 | –0.06 | F07 | –0.12 | F07 | –0.13 |
| F23 | –0.07 | F19 | –0.12 | F11 | –0.13 |
| F18 | –0.08 | F17 | –0.16 | F01 | –0.30 |

This table presents the loadings from a principal component analysis of high-frequency trading (HFT) strategies. The representation of HFT activity that we use to characterize the strategies is MSG, which comprises all messages an HFT firm actively sends to the market in an interval (submission of nonmarketable limit orders, cancellation of nonmarketable limit orders, and marketable limit orders that result in trade executions). We conduct the analysis using 1-second intervals. For the purpose of the analysis, we view the 31 HFT firms as the “variables,” while the observations are all intervals during the 30-day sample period for all sample stocks. The principal component analysis uses the varimax orthogonal rotation, and the first three principal components are retained for further analysis. The loading of an HFT firm on each of the principal components signifies the extent to which the firm’s activity corresponds to the underlying common strategy represented by that principal component; it is equivalent to the bivariate correlation between the firm’s measure and the principal component and is therefore between $$-$$1 and 1. For each principal component, the loadings are sorted from the most positive to the most negative. An asterisk indicates loadings that are $$\geqslant$$ 0.35.
While the choice of a cutoff for the magnitude above which a loading is considered economically significant is somewhat arbitrary, it helps to have some cutoff in mind when examining the results. We consider loadings to be significant from an economic standpoint if they are greater than or equal to 0.35, and mark them with an asterisk in the table. Eight HFT firms have significant loadings on the first principal component, suggesting that it represents a strategy that is common to these eight firms. Similarly, six firms exhibit significant loadings on the second principal component, ranging from 0.71 to 0.35, while four HFT firms exhibit significant loadings on the third principal component.
Four firms appear to follow more than one strategy: F31 and F28 (significant loadings on the first and second principal components) and F17 and F20 (significant loadings on the first and third principal components).16

To summarize the discussion so far, the first important takeaway from the PCA is that 14 HFT firms compete in three underlying common strategies, with each underlying common strategy followed by multiple firms. On the other hand, 17 HFT firms do not appear to pursue any of these common strategies but rather pursue more unique strategies. How important are the common strategies relative to the unique strategies? The 14 firms that compete in one of the three common strategies represent most of the HFT activity: 96.21% of the messages that HFT firms send to the market and 78.97% of the volume they trade. Therefore, we find that the HFT firms that concentrate their activity in three distinct product categories generate most of the HFT activity. To obtain a deeper understanding of these product categories, we turn to characterizing how the underlying common strategies represented by the principal components are related to the market environment.

3.4 Regressions on principal component scores

The challenge in interpreting the results of a PCA lies in understanding the economic nature of the principal components. To gain additional insights, we use another output of the methodology: the component scores. Each principal component is essentially a linear combination of the observed variables (the 31 HFT firms’ activity variables), and the component scores are computed from these variables using the estimated loadings in Table 2 as weights. This creates a separate score for each principal component in each stock and time interval. We use regressions to examine the relationships between these component scores (as dependent variables) and the market environment.
One advantage of using the principal component scores for this purpose is that they are constructed from the estimated loadings and activity variables of all HFT firms. Hence, our analysis of the economic meaning of the underlying common strategies is robust to whatever choice we make on the cutoff that defines which loadings are considered economically significant. The first three explanatory variables in our regression specification provide information about the HFT firms. We are interested in characterizing the extent to which HFT firms operate simultaneously across trading venues. The first variable, HFTcrossmsg, counts messages sent only by HFT firms that submit messages to multiple trading venues in the same interval. Similarly, we are interested in the degree to which HFT firms trade passively by supplying liquidity. The second variable, HFTliqsupply, counts the executed limit orders of all HFT firms in each interval. The third variable, |HFTinventory|, is the absolute value of the aggregate inventory position of all HFT firms in Canadian dollars (cumulative from the beginning of the day and assuming that all of them start the day with zero inventory). The next two explanatory variables represent the degree of integration of the Canadian market: PriceAlign is the percentage of time that the three trading venues with the highest market share post the same bid or ask price, whereas SpreadAlign is the percentage of time that these three trading venues have the same bid-ask spread. 
The next three explanatory variables represent the state of liquidity in the market (aggregated across all trading venues): total depth up to 10 cents from the market-wide best bid or offer (MWBBO), the magnitude of depth imbalance at the MWBBO (defined as the absolute value of the difference between the number of shares at the bid and at the ask), and percentage MWBBO spread.17 The last five explanatory variables represent market conditions in the interval: return (computed from the last transaction price in each interval), volatility (computed as the absolute value of return), the average time between trades in the interval, the return on an ETF that tracks the S&P/TSX Capped Composite Index (XIC), and the ETF return volatility.18 The ETF return and volatility variables can also be viewed as potential signals for short-term speculation by HFT firms. All explanatory variables are standardized to have zero mean and unit standard deviation to make the coefficients comparable across variables. Table 3 shows the regression coefficients together with $$t$$-statistics computed from double-clustered standard errors (along both the stocks and intervals dimensions) to focus our attention on the most significant relationships. Looking at the regression on the first principal component scores, we observe that the coefficient on HFT cross-venue activity (HFTcrossmsg) is positive and highly statistically significant. HFT firms that load on this principal component are also more active when prices are more volatile (a positive and significant coefficient on |Return|), and there is less depth in the book (Depth10) and greater imbalance at the top of the book (|TopDepthImb|). These are times at which some trading venues post better spreads than others (a negative and significant coefficient on SpreadAlign) and there appears to be a need to move liquidity across trading venues, which the HFT firms indeed do with their more intense cross-venue message activity. 
This suggests that the first principal component could represent a cross-venue arbitrage strategy.

Table 3 Regressions using principal component scores

| Variable | PC1 Coef. | PC1 $$t$$-stat. | PC2 Coef. | PC2 $$t$$-stat. | PC3 Coef. | PC3 $$t$$-stat. |
|---|---|---|---|---|---|---|
| HFTcrossmsg | 0.6717 | 8.25 | 0.1548 | 1.55 | 0.3330 | 3.74 |
| HFTliqsupply | –0.0285 | –1.07 | 0.3556 | 10.61 | –0.0973 | –4.12 |
| \|HFTinventory\| | 0.0051 | 1.68 | –0.0047 | –1.40 | 0.0013 | 0.90 |
| PriceAlign | –0.0050 | –0.41 | 0.0299 | 2.60 | –0.0386 | –4.45 |
| SpreadAlign | –0.0261 | –5.00 | 0.0159 | 2.53 | 0.0013 | 0.33 |
| Depth10 | –0.0428 | –4.84 | 0.0737 | 4.76 | –0.0278 | –2.43 |
| \|TopDepthImb\| | 0.0057 | 2.55 | –0.0193 | –2.74 | 0.0078 | 2.10 |
| %Spread | –0.0077 | –2.07 | 0.0030 | 1.43 | 0.0068 | 2.01 |
| Return | 0.0015 | 2.04 | 0.0007 | 0.91 | 0.0007 | 0.97 |
| \|Return\| | 0.0262 | 2.54 | –0.0023 | –0.39 | 0.0256 | 3.45 |
| Time-bet-trades | –0.0850 | –5.54 | –0.1051 | –6.53 | 0.0151 | 1.73 |
| ETF return | –0.0004 | –1.65 | 0.0002 | 0.62 | 0.0004 | 0.71 |
| \|ETF return\| | 0.0006 | 1.39 | 0.0000 | 0.00 | 0.0032 | 4.24 |
| Intercept | 0.0000 | 0.00 | 0.0000 | 0.00 | 0.0000 | 0.00 |
| R$$^{\mathrm{2}}$$ (%) | 49.61 | | 27.40 | | 9.84 | |

This table presents regressions of principal component scores on variables that represent the market environment.
In the PCA of the high-frequency trading (HFT) activity, the 31 HFT firms are the “variables,” while the observations are all 1-second intervals during the 30-day sample period for all sample stocks. The PCA uses the varimax orthogonal rotation, and the first three principal components are retained for further analysis. There are 13 right-hand-side variables in each regression that describe attributes of the HFT strategies, as well as the market environment. We identify only the HFT firms that simultaneously (i.e., in the same interval) submit messages to multiple trading venues, and HFTcrossmsg is the aggregate number of these messages for the 31 HFT firms. HFTliqsupply is the aggregate number of passive (i.e., liquidity supplying) trades for the HFT firms, and |HFTinventory| is the absolute value of the aggregate inventory position of all HFT firms in Canadian dollars (cumulative from the beginning of the day and assuming that all of them start the day with zero inventory). The next two variables represent the degree of integration of the Canadian market. Specifically, PriceAlign is the percentage of time that the three trading venues with the highest market share post the same bid or ask prices, and SpreadAlign is the percentage of time that the three trading venues have the same bid-ask spread. The next three variables represent the state of liquidity in the market (aggregated across all trading venues): total depth up to 10 cents from the market-wide best bid or offer (MWBBO), depth imbalance at the MWBBO (defined as the absolute value of the difference between the number of shares at the bid and at the ask), and percentage MWBBO spread. The last five variables represent market conditions in the interval: return (computed from the last transaction price in each interval), volatility (computed as the absolute value of return), the average time between trades in the interval, return on an ETF that represents the market, and ETF volatility (|ETF Return|). 
All right-hand-side variables are standardized to have zero mean and unit standard deviation to make the coefficients comparable across variables. We report the regression coefficients together with $$t$$-statistics computed using double-clustered (interval and stock) standard errors.

We want to be clear that, as is always the case with PCA, the principal components do not possess an inherent economic interpretation. In every application of this methodology, however, researchers are tasked with giving them one based on the output. We evaluate the regression coefficients of the principal component scores on attributes of the market environment, and our interpretation simply reflects our reading of these coefficients. Whether these labels (e.g., cross-venue arbitrage) are accurate or not does not detract from our use of principal components as data-driven representations of underlying common strategies in a way that can help us understand the competitive landscape of HFT firms. 
Our use of the principal components in the rest of the paper does not depend on these labels. Turning to the second principal component, we find that the coefficient on trading passively by providing liquidity (HFTliqsupply) is large and highly statistically significant. This underlying common strategy is more active when prices and spreads are more tightly aligned across the trading venues (positive coefficients on PriceAlign and SpreadAlign), and may represent times when market-making activity is most profitable (i.e., when market-making firms earn the spread rather than lose to changing prices). This strategy is more active when there is greater depth in the book and a smaller imbalance in the top prices, creating ample opportunities for market making without excessive risk.19 This combination of coefficients, and especially the large coefficient on passive trading that is traditionally associated with market makers, could indicate that this underlying common strategy is related to market making. The underlying common strategy represented by the third principal component has a combination of statistically significant coefficients that suggests a distinct strategy, although it shares some attributes with the first principal component. HFT firms that load on the third principal component are also more active when prices are more volatile, when depth in the book is lower, and there is greater imbalance at the top prices. Two of the economic relationships that make the third principal component different are a significant negative coefficient on PriceAlign, which means that the best prices are not the same across the trading venues, and a negative and significant coefficient on HFTliqsupply, which indicates more active (rather than passive) HFT trading. 
This pattern of relationships with the market environment could represent a short-horizon directional strategy (e.g., momentum) that requires quick trading using marketable orders and is more profitable when there is less depth throughout the book and prices are fragmented across the trading venues. Furthermore, this is the only principal component with a significant coefficient on an ETF variable. The positive relationship with ETF volatility (|ETF Return|) indicates that the short-horizon directional speculation could involve signals on market movements. There could certainly be other labels for the underlying common strategies that would fit such patterns in regression coefficients. The important takeaway from the regressions, however, is not the labels. Rather, it is the recognition that there is heterogeneity in the underlying common strategies, as evidenced by the distinct relationships we document between each principal component and the variables that represent various aspects of the market environment. Furthermore, there appears to be a significant number of HFT firms that pursue unique strategies, although they represent a small portion of HFT activity. The heterogeneity in HFT strategies is important insofar as the insights generated by thinking about HFT as a single “entity” could be rather limited because aggregating HFT activity hides diversity that is important for understanding how HFT firms interact with markets and ultimately affect them.

4. HFT Competition and the Market Environment

Having established in Section 3 that there are three distinct product categories, we proceed to investigate how competition between HFT firms in each product category affects the market environment. As discussed in Section 3.1, we rely on results from the industrial organization literature to motivate using the extent of similarity in strategies as a measure of the intensity of competition. 
The more similar are the strategies of HFT firms, the more intensely these firms compete with one another. We use these similarity measures to examine whether HFT competition could exacerbate volatility and study how HFT competition interacts with competition between trading venues.

4.1 HFT competition and volatility

Greater similarity in strategies brings about greater competition but also means that multiple HFT firms engage in activities that are highly correlated across stocks. The dominance of HFT firms in the trading environment coupled with this highly correlated activity could potentially amplify idiosyncratic occurrences at the microstructure level and spread them across the market. This seems pertinent given our finding in Section 3 that the 14 HFT firms that pursue correlated strategies represent most of the HFT activity (96.21% of the messages that HFT firms send to the market and 78.97% of the volume they trade). While the question of whether the magnitude of HFT activity affects volatility has been studied elsewhere (see, e.g., Hagströmer and Norden 2013; Hasbrouck and Saar 2013; Brogaard et al. Forthcoming), our focus is on the question of how competition between HFT firms relates to volatility. Specifically, we study how the short-horizon volatility of stocks loads on a time-varying market-wide measure that captures the similarity in HFT strategies across stocks in each of the product categories. For each 1-second time interval, we construct the similarity measure by computing the correlation coefficient between the activities of pairs of HFT firms across the stocks in the sample and averaging the correlations for all pairs of HFT firms in each product category.20 Specifically, let MSG$$_{itk}$$ be the message activity of HFT firm $$k$$ in stock $$i$$ and interval $$t$$. 
If $$k$$ and $$l$$ are two HFT firms that follow the underlying common strategy we associate with the first principal component, then $$\mathrm{Cor}(\mathrm{MSG}_{k},\mathrm{MSG}_{l})_{t}$$ is the cross-stock correlation between the activities of these two firms for interval $$t$$. There are $$0.5 \times 8 \times (8-1) = 28$$ such correlations for the group of eight HFT firms that follow this underlying common strategy, and the market-wide similarity measure for interval $$t$$, MktSimPC1$$_{t}$$, is computed as the average of these correlations for the 28 pairs. We compute MktSimPC2$$_{t}$$ and MktSimPC3$$_{t}$$ in the same fashion as the similarity measures for the groups of firms that follow the underlying common strategies we associate with the second and third principal components, respectively.21 While MSG is a comprehensive representation of HFT activity in that it incorporates every action a firm initiates over an interval, TRD and LMT represent other aspects of an HFT strategy.22 A priori, it is unclear which one is more important in explaining volatility. On the one hand, TRD focuses on trade executions, and as such it could be more tightly related to price changes. On the other hand, most of the activity of HFT firms involves order flow that does not culminate in trades (limit order submission and cancellation). Including such orders in the representation of the strategy may result in a better description of HFT activity and therefore could be more appropriate when analyzing how HFT activity affects volatility. 
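The construction of the market-wide similarity measure can be illustrated with a short sketch. The Python code below is illustrative only and is not the authors' implementation: the function names `market_similarity` and `n_pairs`, the array layout (stocks in rows, firms in columns), the simulated Poisson message counts, and the decision to skip firm pairs whose correlation is undefined are all assumptions made for the example.

```python
from itertools import combinations

import numpy as np


def n_pairs(n_firms):
    """Number of distinct firm pairs, 0.5 * n * (n - 1)."""
    return n_firms * (n_firms - 1) // 2


def market_similarity(activity):
    """Average pairwise cross-stock correlation for a single interval t.

    activity: array of shape (n_stocks, n_firms); activity[i, k] is the
    activity (e.g., message count) of firm k in stock i during the interval.
    """
    activity = np.asarray(activity, dtype=float)
    n_firms = activity.shape[1]
    # Pearson correlations between firms, computed across stocks.
    with np.errstate(invalid="ignore", divide="ignore"):
        corr = np.corrcoef(activity, rowvar=False)
    pair_corrs = np.array(
        [corr[k, l] for k, l in combinations(range(n_firms), 2)]
    )
    # Assumption: a pair is skipped when its correlation is undefined
    # (one firm's activity is constant across stocks in this interval).
    defined = pair_corrs[~np.isnan(pair_corrs)]
    return float(defined.mean()) if defined.size else float("nan")


# Simulated interval: 52 stocks, 8 firms (hypothetical message counts).
rng = np.random.default_rng(0)
msg_t = rng.poisson(5.0, size=(52, 8))
sim_pc1 = market_similarity(msg_t)  # averages over n_pairs(8) == 28 pairs
```

Repeating the calculation for every interval, and separately for each group of firms and each activity variable (MSG, TRD, LMT), would yield time series analogous to the similarity measures described in the text.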
We use similarity measures computed from MSG, TRD, and LMT to examine whether the results are sensitive to the specific representation of the strategies.23 We investigate whether HFT competition affects the short-horizon volatility of individual stocks by running the following regression:
\begin{align}
\left| {r_{it} } \right| & = a_{0i} +a_{1i} \textit{MktSimPC}1_{t} +a_{2i} \textit{MktSimPC}2_{t} +a_{3i} \textit{MktSimPC}3_{t} \notag\\
&\quad +a_{4i} \textit{MktPC}1_{t} + a_{5i} \textit{MktPC}2_{t} +a_{6i} \textit{MktPC}3_{t} +a_{7i} \textit{MktRet}_{t} \notag\\
&\quad +a_{8i} \left| {\textit{MktRet}_{t} } \right|+a_{9i} \textit{MktVol}_{t} +\textit{error}_{it} \,, \tag{1}
\end{align}
where $$|r_{it}|$$ is the absolute value of the interval return for stock $$i$$ in interval $$t$$ that we use as our measure of short-horizon (or interval) volatility.24 The right-hand side of Equation (1) contains only market-wide variables. Since we are interested in studying the effect of similarity in HFT strategies, not the magnitude of HFT activity, we add as control variables the aggregated activity of the HFT firms that follow the three principal components (MktPC1$$_{t}$$, MktPC2$$_{t}$$, and MktPC3$$_{t}$$). As in a typical market model, we include the value-weighted return of all stocks in our sample (MktRet$$_{t}$$), and we also control for market volatility (the absolute value of the market return) and market volume (aggregated for all stocks).25 We do not believe that the model in Equation (1) suffers from endogeneity in the form of either reverse causality or omitted variables. With respect to the former, it is generally accepted that there is no reverse causality when one regresses a stock attribute on market attributes. When we run a market model, for example, we normally assume that the stock return does not “cause” the market return. 
Similarly, the volatility of a single stock in Equation (1) does not cause market volatility or the market-wide similarity in HFT strategies that is computed using HFT activity across all stocks in the sample. As for omitted variables, conventional wisdom suggests that the activity of HFT firms is driven by volume and volatility.26 It is certainly possible that a market volatility shock, for example, could affect both stock volatility (if stocks respond to market conditions) and the extent of similarity in HFT strategies. Once market volatility is controlled for in the regression, however, the shock will not be captured by the similarity measures. Panel A of Table 4 presents the correlations between the similarity measures and the rest of the market-wide variables used in the regressions. The highest correlation between two similarity measures is 0.308, and the correlations between the similarity measures and the control variables range from $$-$$0.215 to 0.130, which does not preclude having them together in a single regression specification. Panel B presents the results of 52 regressions, one for each stock, for the model specified in Equation (1). For each independent variable, we provide the average coefficient value, a $$t$$-statistic computed from the cross-sectional distribution of the regression coefficients, and several summary statistics (e.g., the number of positive and negative coefficients).

Table 4
Short-horizon stock volatility and HFT competition

A. Correlations of HFT similarity measures with all other independent variables

                       MSG                              TRD                              LMT
           MktSimPC1 MktSimPC2 MktSimPC3    MktSimPC1 MktSimPC2 MktSimPC3    MktSimPC1 MktSimPC2 MktSimPC3
MktSimPC1     1         0.282     0.194        1         0.278     0.194        1         0.308     0.137
MktSimPC2     0.282     1        –0.012        0.278     1        –0.012        0.308     1         0.114
MktSimPC3     0.194    –0.012     1            0.194    –0.012     1            0.137     0.114     1
MktPC1       –0.073     0.013    –0.036       –0.076     0.012    –0.035        0.001    –0.031    –0.151
MktPC2        0.018     0.130    –0.064        0.016     0.129    –0.064       –0.031     0.020    –0.170
MktPC3       –0.147    –0.007    –0.021       –0.149    –0.009    –0.021       –0.034    –0.062    –0.131
MktRet        0.001    –0.001     0.000        0.001    –0.001     0.000       –0.005    –0.003     0.013
|MktRet|     –0.025     0.013    –0.031       –0.027     0.010    –0.031       –0.113    –0.108    –0.215
MktVol        0.013     0.009    –0.004        0.013     0.008    –0.004       –0.007    –0.008    –0.132

B. Regressions of stock short-horizon volatility on HFT competition

                  Avg.             Avg.        # coef   # $$t$$-stat   # coef   # $$t$$-stat   Avg. R$$^{2}$$
Variable          coef             $$t$$-stat    < 0      < –1.96       > 0       > 1.96         (%)
MSG
  Intercept       9.39 × 10^–6      3.99          8          7          44         44           6.13
  MktSimPC1      –2.52 × 10^–6     –0.81         33         30          19         13
  MktSimPC2      –8.90 × 10^–6     –4.29         41         39          11         10
  MktSimPC3      –6.84 × 10^–7     –0.68         37         29          15          9
  MktPC1         –1.20 × 10^–9     –0.63         34         33          18         14
  MktPC2          7.96 × 10^–9      3.87         10          7          42         39
  MktPC3          9.25 × 10^–9      5.76          8          7          44         42
  MktRet         –6.66 × 10^–3     –1.58         33          8          19          3
  |MktRet|        8.56 × 10^–1     10.81          0          0          52         52
  MktVol          2.02 × 10^–12    10.90          4          2          48         45
TRD
  Intercept       4.37 × 10^–5      7.27          1          0          51         50          12.56
  MktSimPC1      –1.01 × 10^–5     –4.84         42         27          10          3
  MktSimPC2      –1.91 × 10^–5     –9.78         48         43           4          0
  MktSimPC3      –2.96 × 10^–5     –3.80         48         41           4          3
  MktPC1         –2.35 × 10^–8     –0.71         34         15          18          9
  MktPC2          8.13 × 10^–7      0.27         25         11          27         12
  MktPC3          4.03 × 10^–7      4.95         11          2          41         28
  MktRet          1.51 × 10^–3      0.15         23          4          29          4
  |MktRet|        9.87 × 10^–1     20.81          0          0          52         52
  MktVol         –2.20 × 10^–13    –0.41         31         10          21         11
LMT
  Intercept       9.42 × 10^–6      4.00          8          7          44         44           6.12
  MktSimPC1      –2.54 × 10^–6     –0.82         33         31          19         13
  MktSimPC2      –8.65 × 10^–6     –4.24         41         39          11         10
  MktSimPC3      –7.39 × 10^–7     –0.73         37         29          15          9
  MktPC1         –7.74 × 10^–10    –0.42         34         34          18         15
  MktPC2          6.98 × 10^–9      3.52         12          7          40         39
  MktPC3          8.96 × 10^–9      5.76          7          7          45         42
  MktRet         –6.64 × 10^–3     –1.58         33          8          19          3
  |MktRet|        8.58 × 10^–1     10.88          0          0          52         52
  MktVol          2.05 × 10^–12    10.86          4          2          48         45

This table presents the results of regressions of short-horizon stock volatility on time-varying, market-wide measures that capture similarity in high-frequency trading (HFT) strategies across stocks in each of the product categories (or underlying common strategies). 
For each stock, we estimate the following regression over all 1-second intervals in the sample period:
\begin{align*}
\left| {r_{it} } \right| &= a_{0i} +a_{1i} \textit{MktSimPC}1_{t} +a_{2i} \textit{MktSimPC}2_{t} +a_{3i} \textit{MktSimPC}3_{t} +a_{4i} \textit{MktPC}1_{t}\\
&\quad + a_{5i} \textit{MktPC}2_{t} +a_{6i} \textit{MktPC}3_{t} +a_{7i} \textit{MktRet}_{t} +a_{8i} \left| {\textit{MktRet}_{t} } \right|+a_{9i} \textit{MktVol}_{t} + \textit{error}_{it} ,
\end{align*}
where $$|r_{it}|$$ is the absolute value of the return on stock $$i$$ in interval $$t$$, which is our measure of interval return volatility, MktRet$$_{t}$$ is the value-weighted return of all stocks in our sample, |MktRet$$_{t}$$| is the absolute value of the market return, and MktVol$$_{t}$$ is the aggregate volume in all stocks. We compute the similarity measures that serve as proxies for HFT competition in the three product categories as follows. For each interval $$t$$, MktSimPC1$$_{t}$$ is the average over the cross-stock correlations of all pairs of HFT firms from among the eight firms that load significantly on the first principal component, and MktSimPC2$$_{t}$$ (MktSimPC3$$_{t}$$) is computed in an analogous fashion for the firms that load significantly on the second (third) principal component. We add the magnitude of HFT activity of the firms that load on the three principal components as control variables: MktPC1$$_{t}$$, MktPC2$$_{t}$$, and MktPC3$$_{t}$$. We compute the similarity measures for three variables that represent HFT strategies: MSG (all messages HFT firms send to the market), TRD (all their trades), and LMT (submissions and cancellations of nonmarketable limit orders). Panel A shows the Pearson correlations between the measures of similarity in HFT strategies and all control variables in the regressions. 
Panel B presents for each independent variable the average coefficient across the 52 stocks, $$t$$-statistic computed from the cross-sectional distribution of the regression coefficients, the number of negative coefficients, the number of negative coefficients that are significant at the 5% level (from a two-sided test), the number of positive coefficients, the number of significant positive coefficients, and the average R$$^{\mathrm{2}}$$. We observe a rather striking relationship: the loadings on the HFT similarity measures that represent competition between HFT firms in the three underlying common strategies are all negative. A closer look suggests that for both MSG and LMT, greater similarity between strategies of HFT firms that follow the second principal component (MktSimPC2) seems to drive a significant portion of the relationship: its mean coefficient is the most negative, the $$t$$-statistic computed from the cross-sectional distribution of the coefficients points to a highly significant relationship, and there are 41 stocks (out of 52) for which this variable has negative loadings in the individual stock regressions.27 This principal component represents the underlying common strategy that we associate with market making based on the component scores regressions in Section 3. The magnitude of the effect is also economically meaningful. An increase in MktSimPC2 from 0 to 0.5, for example, would result in a decrease of 16.1% (MSG) to 34.4% (TRD) in 1-second volatility.28 As expected, the volatility of individual stocks increases with the volatility of the market. 
In contrast, the loadings on the HFT similarity measures are predominantly negative, suggesting that HFT competition does not serve as a proxy for market volatility, but rather represents a separate and distinct effect.29 Our results, therefore, contrast with Jarrow and Protter’s (2012) prediction that competition between HFT firms that manifests as more highly correlated HFT activity would increase volatility. Why would greater competition between HFT firms that pursue market-making strategies decrease rather than increase return volatility? At short horizons (e.g., 1-second intervals), volatility is driven predominantly by the price impact of trades (as opposed to the public release of fundamental news, which is relatively rare for a stock), and competition between HFT firms could affect this price impact. For example, Jovanovic and Menkveld (2016) posit that HFT intermediaries incorporate “hard information,” such as price changes in same-industry stocks or the market index, into their algorithms. Greater competition to turn these cross-stock “private” signals into public information implies lower adverse selection costs and hence a lower permanent price impact of trades. The negative loadings could also be driven by the effects of HFT competition on the temporary, rather than the permanent, price impact of trades if they reflect reduced rents from market-making activity or better inventory control. If returns and order flows of various stocks are correlated, efficient market making would necessitate algorithms that consider multiple stocks (e.g., Ho and Stoll 1983). Competition between such market-making algorithms would decrease the temporary price impact of trades (or realized spread), lowering short-horizon volatility. 
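The way stock-by-stock regressions are condensed into the reported statistics (average coefficient, cross-sectional $$t$$-statistic, and counts of negative and positive loadings) can be illustrated with a minimal sketch. It runs on synthetic data, and the names `ols_with_tstats` and `summarize` are hypothetical. The cross-sectional $$t$$-statistic is assumed here to be the mean coefficient divided by its cross-sectional standard error, which may differ from the exact computation behind the tables.

```python
import numpy as np

def ols_with_tstats(y, X):
    """OLS of y on a constant and X; returns coefficients and classical t-statistics."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    sigma2 = resid @ resid / (len(y) - X1.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X1.T @ X1)))
    return beta, beta / se

def summarize(coefs, tstats, crit=1.96):
    """Cross-sectional summary of one regressor's coefficient across stocks."""
    n = len(coefs)
    mean = coefs.mean()
    # Assumed definition: mean coefficient over its cross-sectional standard error.
    cs_t = mean / (coefs.std(ddof=1) / np.sqrt(n))
    return {
        "avg_coef": float(mean),
        "cs_tstat": float(cs_t),
        "n_neg": int((coefs < 0).sum()),
        "n_sig_neg": int((tstats < -crit).sum()),
        "n_pos": int((coefs > 0).sum()),
        "n_sig_pos": int((tstats > crit).sum()),
    }

# Synthetic illustration only: 52 "stocks", 500 intervals, three stand-in regressors.
rng = np.random.default_rng(0)
n_stocks, n_obs = 52, 500
coefs, tstats = [], []
for _ in range(n_stocks):
    X = rng.normal(size=(n_obs, 3))
    # The second regressor is given a negative true loading of -0.5.
    y = 1.0 - 0.5 * X[:, 1] + rng.normal(size=n_obs)
    b, t = ols_with_tstats(y, X)
    coefs.append(b[2])   # index 0 is the constant, so index 2 is the second regressor
    tstats.append(t[2])

summary = summarize(np.array(coefs), np.array(tstats))
print(summary)
```

The sketch uses classical OLS standard errors for the individual-stock $$t$$-statistics; any adjustment for serial correlation or heteroscedasticity would change the significance counts but not the coefficient signs.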
To investigate the channel through which HFT competition affects volatility, we therefore need to consider the price impact of trades.30 Let $$p_{t}$$ be the execution price of a trade, $$m_{t}$$ be the prevailing midquote at the time of the trade execution, and $$m_{t+\tau}$$ be the midquote $$\tau$$ seconds after the trade. The total price impact of a trade has often been decomposed in the market microstructure literature into permanent and temporary components. The permanent price impact is the signed change from the prevailing midquote to the midquote at $$t+\tau$$, defined as $$(m_{t+\tau} - m_{t})$$ multiplied by 1 ($$-1$$) for buys (sells), and is commonly attributed to information asymmetry (or adverse selection). The temporary price impact is the signed change between the trade price and the midquote at $$t+\tau$$, defined as $$(p_{t} - m_{t+\tau})$$ multiplied by 1 ($$-1$$) for buys (sells), and is often thought of as the reversal needed to compensate liquidity providers (also called the realized spread). The length of time used for the price adjustment ($$\tau$$) has varied historically. SEC Rule 11Ac1-5, which became effective in 2001, prescribed computing the realized spread for reporting purposes using $$\tau =5$$ minutes. However, with the advent of HFT algorithms that respond to the market in milliseconds, researchers have been using intervals as short as 5 seconds (e.g., O’Hara, Saar, and Zhong 2017). The permanent or temporary price impact is computed for each trade, and then averaged over all trades (often using volume weighting) to obtain a measure of these price impacts for an aggregation interval, $$T$$, such that $$T >\tau$$. Our paper focuses on very short intervals with lengths of 1, 10, or 60 seconds. 
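The decomposition above can be made concrete with a short sketch that computes volume-weighted permanent and temporary price impacts for the trades in one aggregation interval. The function name `price_impacts`, the tuple layout, and the two example trades are hypothetical and serve only as an illustration. By construction, the two components sum to the signed distance between the trade price and the prevailing midquote.

```python
def price_impacts(trades):
    """Volume-weighted permanent and temporary price impacts for one interval.

    Each trade is (price, mid, mid_later, side, volume), where side is +1 for a
    buy and -1 for a sell, mid is the prevailing midquote at execution, and
    mid_later is the midquote tau seconds after the trade.
    """
    total_volume = sum(volume for *_, volume in trades)
    # Permanent impact: signed move of the midquote, side * (m_{t+tau} - m_t).
    ppi = sum(side * (mid_later - mid) * volume
              for _, mid, mid_later, side, volume in trades) / total_volume
    # Temporary impact (realized spread): side * (p_t - m_{t+tau}).
    tpi = sum(side * (price - mid_later) * volume
              for price, _, mid_later, side, volume in trades) / total_volume
    return ppi, tpi

# Hypothetical trades, for illustration only.
trades = [
    (10.02, 10.00, 10.010, +1, 100),  # buy of 100 shares
    (9.99, 10.00, 9.995, -1, 300),    # sell of 300 shares
]
ppi, tpi = price_impacts(trades)

# Signed distance between trade price and prevailing midquote, volume weighted.
half_spread = (sum(side * (price - mid) * volume
                   for price, mid, _, side, volume in trades)
               / sum(volume for *_, volume in trades))
print(ppi, tpi, half_spread)
```

In this example both components are approximately 0.00625 per share and their sum matches the volume-weighted signed distance of approximately 0.0125.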
To compute measures of permanent or temporary price impacts for a 1-second interval, we need to assume that price adjustments for all trades are completed within the same second (i.e., that $$\tau$$ is much smaller than 1 second), which can be unrealistic.31 We therefore use $$\tau =10$$ seconds for the computation of the price impact of each trade and $$T=60$$ seconds as the aggregation interval to obtain measures of permanent and temporary price impacts.32 To investigate whether the influence of HFT competition on volatility is driven by the manner in which it affects the permanent price impact of trades, we first regress for each stock the return volatility measure on the volume-weighted permanent price impact measure using all 60-second intervals in the sample periods: \[ \left| {r_{it} } \right|=b_{0i} +b_{1i} \textit{PPI}_{it} +\textit{error}_{it} . \] We then take the fitted values from this regression, $$\left| \hat{r} \right|_{\textit{PPI}_{it}}$$, which describe the portion of short-horizon volatility that is explained by the permanent price impact, and use them as the dependent variable in the regression specified in Equation (1). Similarly, to investigate whether the negative impact of HFT competition on volatility is driven by a reduction in the temporary price impact of trades, we regress the volatility measure on the volume-weighted temporary price impact measure: \[ \left| {r_{it} } \right|=b_{0i} +b_{1i} \textit{TPI}_{it} +\textit{error}_{it} , \] and use the fitted values, $$\left| \hat{r} \right|_{\textit{TPI}_{it}}$$, as a dependent variable for the regression in Equation (1). Panel A of Table 5 shows the results using the permanent price impact fitted values.33 We observe that the only underlying common strategy that exhibits a consistent negative relationship using all three representations of HFT activity (MSG, TRD, and LMT) is the one captured by the second principal component, which we associate with market making. 
In particular, 45 out of 52 stocks have negative loadings on MktSimPC2 in these stock-by-stock regressions using MSG, and the mean coefficient is highly statistically significant. The loadings on MktSimPC3 do not have a predominant direction, and those on MktSimPC1 show a direction only for TRD, and it is positive rather than negative. Panel B of Table 5 shows the results of the regressions using the temporary price impact fitted values as the dependent variable. We observe a similar albeit weaker picture: the mean coefficient on MktSimPC2 is negative for MSG, LMT, and TRD, but TRD is not statistically significant.

Table 5 Price impact portions of short-horizon volatility and HFT competition

A. Permanent price impact portion of volatility and HFT competition

| HFT activity | Variable | Avg. coef | CS $$t$$-stat | # coef $$<$$ 0 | # $$t$$-stat $$<-1.96$$ | # coef $$>$$ 0 | # $$t$$-stat $$>$$ 1.96 | Avg. R$$^{2}$$ (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MSG | Intercept | $$4.97 \times 10^{-4}$$ | 22.67 | 0 | 0 | 52 | 52 | 4.70 |
| MSG | MktSimPC1 | $$9.09 \times 10^{-6}$$ | 0.61 | 29 | 12 | 23 | 17 | |
| MSG | MktSimPC2 | $$-8.27 \times 10^{-5}$$ | -7.51 | 45 | 33 | 7 | 1 | |
| MSG | MktSimPC3 | $$3.22 \times 10^{-6}$$ | 0.57 | 22 | 9 | 30 | 11 | |
| MSG | MktPC1 | $$1.35 \times 10^{-9}$$ | 7.00 | 8 | 3 | 44 | 35 | |
| MSG | MktPC2 | $$9.67 \times 10^{-12}$$ | 0.05 | 25 | 9 | 27 | 9 | |
| MSG | MktPC3 | $$-1.30 \times 10^{-9}$$ | -4.95 | 39 | 30 | 13 | 5 | |
| MSG | MktRet | $$2.97 \times 10^{-4}$$ | 0.23 | 25 | 1 | 27 | 0 | |
| MSG | $$\lvert\textit{MktRet}\rvert$$ | $$9.22 \times 10^{-2}$$ | 16.07 | 0 | 0 | 52 | 51 | |
| MSG | MktVol | $$9.00 \times 10^{-14}$$ | 1.14 | 20 | 6 | 32 | 2 | |
| TRD | Intercept | $$5.09 \times 10^{-4}$$ | 21.77 | 0 | 0 | 52 | 52 | 4.20 |
| TRD | MktSimPC1 | $$3.56 \times 10^{-5}$$ | 2.86 | 21 | 3 | 31 | 18 | |
| TRD | MktSimPC2 | $$-4.74 \times 10^{-5}$$ | -3.89 | 39 | 21 | 13 | 3 | |
| TRD | MktSimPC3 | $$-2.88 \times 10^{-6}$$ | -0.48 | 22 | 6 | 30 | 8 | |
| TRD | MktPC1 | $$9.01 \times 10^{-8}$$ | 7.04 | 4 | 0 | 48 | 29 | |
| TRD | MktPC2 | $$1.88 \times 10^{-8}$$ | 2.86 | 11 | 5 | 41 | 20 | |
| TRD | MktPC3 | $$-1.43 \times 10^{-7}$$ | -6.66 | 46 | 30 | 6 | 2 | |
| TRD | MktRet | $$-1.47 \times 10^{-3}$$ | -0.89 | 25 | 2 | 27 | 0 | |
| TRD | $$\lvert\textit{MktRet}\rvert$$ | $$9.38 \times 10^{-2}$$ | 13.13 | 1 | 0 | 51 | 48 | |
| TRD | MktVol | $$-1.22 \times 10^{-13}$$ | -1.15 | 33 | 9 | 19 | 1 | |
| LMT | Intercept | $$4.97 \times 10^{-4}$$ | 22.87 | 0 | 0 | 52 | 52 | 4.74 |
| LMT | MktSimPC1 | $$1.35 \times 10^{-5}$$ | 0.88 | 29 | 11 | 23 | 18 | |
| LMT | MktSimPC2 | $$-8.85 \times 10^{-5}$$ | -7.71 | 44 | 34 | 8 | 2 | |
| LMT | MktSimPC3 | $$3.08 \times 10^{-6}$$ | 0.55 | 22 | 9 | 30 | 11 | |
| LMT | MktPC1 | $$1.30 \times 10^{-9}$$ | 6.95 | 8 | 3 | 44 | 34 | |
| LMT | MktPC2 | $$7.28 \times 10^{-11}$$ | 0.36 | 25 | 7 | 27 | 9 | |
| LMT | MktPC3 | $$-1.25 \times 10^{-9}$$ | -4.87 | 38 | 27 | 14 | 5 | |
| LMT | MktRet | $$1.77 \times 10^{-4}$$ | 0.14 | 25 | 1 | 27 | 0 | |
| LMT | $$\lvert\textit{MktRet}\rvert$$ | $$9.19 \times 10^{-2}$$ | 16.01 | 0 | 0 | 52 | 51 | |
| LMT | MktVol | $$6.85 \times 10^{-14}$$ | 0.89 | 23 | 6 | 29 | 2 | |

B. Temporary price impact portion of volatility and HFT competition

| HFT activity | Variable | Avg. coef | CS $$t$$-stat | # coef $$<$$ 0 | # $$t$$-stat $$<-1.96$$ | # coef $$>$$ 0 | # $$t$$-stat $$>$$ 1.96 | Avg. R$$^{2}$$ (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MSG | Intercept | $$5.34 \times 10^{-4}$$ | 22.83 | 0 | 0 | 52 | 52 | 1.10 |
| MSG | MktSimPC1 | $$4.30 \times 10^{-6}$$ | 0.75 | 20 | 6 | 32 | 9 | |
| MSG | MktSimPC2 | $$-1.21 \times 10^{-5}$$ | -3.14 | 38 | 16 | 14 | 1 | |
| MSG | MktSimPC3 | $$-1.85 \times 10^{-6}$$ | -0.62 | 29 | 9 | 23 | 4 | |
| MSG | MktPC1 | $$7.61 \times 10^{-11}$$ | 1.76 | 16 | 3 | 36 | 17 | |
| MSG | MktPC2 | $$2.08 \times 10^{-10}$$ | 2.80 | 23 | 1 | 29 | 9 | |
| MSG | MktPC3 | $$-3.51 \times 10^{-11}$$ | -0.66 | 36 | 16 | 16 | 5 | |
| MSG | MktRet | $$1.33 \times 10^{-4}$$ | 0.20 | 26 | 11 | 26 | 3 | |
| MSG | $$\lvert\textit{MktRet}\rvert$$ | $$2.32 \times 10^{-2}$$ | 4.11 | 15 | 0 | 37 | 21 | |
| MSG | MktVol | $$-2.97 \times 10^{-14}$$ | -0.70 | 37 | 7 | 15 | 2 | |
| TRD | Intercept | $$5.35 \times 10^{-4}$$ | 22.76 | 0 | 0 | 52 | 52 | 1.01 |
| TRD | MktSimPC1 | $$1.07 \times 10^{-5}$$ | 2.40 | 13 | 1 | 39 | 16 | |
| TRD | MktSimPC2 | $$-2.73 \times 10^{-6}$$ | -1.04 | 29 | 10 | 23 | 1 | |
| TRD | MktSimPC3 | $$-1.41 \times 10^{-6}$$ | -0.63 | 22 | 4 | 30 | 7 | |
| TRD | MktPC1 | $$7.59 \times 10^{-9}$$ | 1.64 | 12 | 2 | 40 | 13 | |
| TRD | MktPC2 | $$-7.93 \times 10^{-10}$$ | -0.38 | 26 | 6 | 26 | 3 | |
| TRD | MktPC3 | $$-1.43 \times 10^{-9}$$ | -0.22 | 36 | 14 | 16 | 5 | |
| TRD | MktRet | $$-1.68 \times 10^{-4}$$ | -0.24 | 27 | 9 | 25 | 4 | |
| TRD | $$\lvert\textit{MktRet}\rvert$$ | $$2.51 \times 10^{-2}$$ | 4.19 | 15 | 0 | 37 | 22 | |
| TRD | MktVol | $$-8.35 \times 10^{-14}$$ | -1.50 | 34 | 4 | 18 | 0 | |
| LMT | Intercept | $$5.34 \times 10^{-4}$$ | 22.86 | 0 | 0 | 52 | 52 | 1.10 |
| LMT | MktSimPC1 | $$4.26 \times 10^{-6}$$ | 0.77 | 18 | 6 | 34 | 9 | |
| LMT | MktSimPC2 | $$-1.28 \times 10^{-5}$$ | -3.21 | 38 | 16 | 14 | 1 | |
| LMT | MktSimPC3 | $$-1.77 \times 10^{-6}$$ | -0.59 | 27 | 8 | 25 | 3 | |
| LMT | MktPC1 | $$7.00 \times 10^{-11}$$ | 1.64 | 17 | 3 | 35 | 16 | |
| LMT | MktPC2 | $$2.19 \times 10^{-10}$$ | 2.93 | 23 | 0 | 29 | 9 | |
| LMT | MktPC3 | $$-2.89 \times 10^{-11}$$ | -0.55 | 35 | 14 | 17 | 5 | |
| LMT | MktRet | $$1.16 \times 10^{-4}$$ | 0.18 | 26 | 11 | 26 | 3 | |
| LMT | $$\lvert\textit{MktRet}\rvert$$ | $$2.31 \times 10^{-2}$$ | 4.11 | 15 | 0 | 37 | 22 | |
| LMT | MktVol | $$-3.20 \times 10^{-14}$$ | -0.77 | 37 | 7 | 15 | 2 | |

This table presents results on the channel through which high-frequency trading (HFT) competition affects volatility by isolating the portions of short-horizon volatility explained by the permanent and temporary price impacts of trades. Let $$p_{t}$$ be the execution price of a trade, $$m_{t}$$ be the prevailing midquote at the time of the trade execution, and $$m_{t+\tau}$$ be the midquote $$\tau$$ seconds after the trade. The permanent price impact is the signed change from the prevailing midquote to the midquote at $$t+\tau$$, defined as $$(m_{t+\tau} - m_{t})$$ multiplied by 1 ($$-1$$) for buys (sells), and is commonly attributed to information asymmetry (or adverse selection). The temporary price impact is the signed change between the trade price and the midquote at $$t+\tau$$, defined as $$(p_{t} - m_{t+\tau})$$ multiplied by 1 ($$-1$$) for buys (sells), and is often thought of as the reversal needed to compensate liquidity providers (also called the realized spread). We compute the permanent or temporary price impacts for each trade using $$\tau =10$$ seconds and then average them (using volume weighting) to obtain a measure of these price impacts for an aggregation interval of $$T=60$$ seconds. To investigate whether the influence of HFT competition on volatility is driven by the manner in which it affects the permanent price impact of trades, we first regress for each stock the return volatility measure on the volume-weighted permanent price impact for all 60-second intervals in the sample periods: \[ \left| {r_{\textit{it}} } \right|=b_{0i} +b_{1i} \textit{PPI}_{\textit{it}} +\textit{error}_{\textit{it}}. \] We then use the fitted values from this regression, $$\left| \hat{r} \right|_{\textit{PPI}_{it}}$$, as the dependent variable in the following regression: \begin{align*} \left| {\hat{{r}}} \right|_{\textit{PPI}_{\textit{it}} }&= a_{0i} +a_{1i} \textit{MktSimPC}1_{t} +a_{2i} \textit{MktSimPC}2_{t} +a_{3i} \textit{MktSimPC}3_{t} +a_{4i} \textit{MktPC}1_{t}\\&\quad + a_{5i} \textit{MktPC}2_{t} +a_{6i} \textit{MktPC}3_{t} +a_{7i} \textit{MktRet}_{t} +a_{8i} \left| {\textit{MktRet}_{t} } \right|+a_{9i} \textit{MktVol}_{t} + \textit{error}_{\textit{it}} , \end{align*} where the independent variables are defined as in Table 5. Similarly, to investigate whether the negative impact of HFT competition on volatility is driven by a reduction in the temporary price impact of trades, we regress the volatility measure on the volume-weighted temporary price impact: \[ \left| {r_{\textit{it}} } \right|=b_{0i} +b_{1i} \textit{TPI}_{\textit{it}} +\textit{error}_{\textit{it}} , \] and use the fitted values, $$\left| \hat{r} \right|_{\textit{TPI}_{it}}$$, as the dependent variable. We present the regression analysis for the permanent price impact fitted values (in panel A) and the temporary price impact fitted values (in panel B) using three variables that represent HFT strategies: MSG (all messages HFT firms send to the market), TRD (all their trades), and LMT (submissions and cancellations of nonmarketable limit orders). The table presents for each independent variable the average coefficient across the 52 stocks, $$t$$-statistics computed from the cross-sectional distribution of the regression coefficients, the number of negative coefficients, the number of negative coefficients that are significant at the 5% level (from a two-sided test), the number of positive coefficients, the number of significant positive coefficients, and the average R$$^{2}$$.

How do these tests help us interpret the volatility results? Table 4 shows that short-horizon volatility loads negatively on competition between HFT firms that pursue the underlying common strategy that we associate with market making. Market makers affect price formation through their liquidity provision, which often determines the price impact of trades. We find that the portions of volatility explained by both the permanent and temporary price impacts appear to drive the negative relationship with HFT competition. The strong result with respect to the permanent price impact can be viewed in the context of the intermediation model in Jovanovic and Menkveld (2016): greater competition means hard information signals are revealed more quickly, and there is therefore less adverse selection in the trading environment. The temporary price impact result we obtain suggests that lower volatility could reflect competition between market makers reducing the compensation they earn (the realized spread). 
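The two-step procedure behind Table 5 can be sketched for a single stock as follows. The data are synthetic, the function names `first_stage` and `second_stage` are hypothetical, and only three stand-in regressors are used in place of the nine in Equation (1). One property worth noting under these assumptions: because the first-stage fitted values are an affine function of the price impact measure, the second-stage slopes equal $$b_{1i}$$ times the slopes from regressing the price impact measure itself on the same regressors.

```python
import numpy as np

def first_stage(abs_ret, impact):
    """Regress |r| on a constant and a price impact measure; return fitted values and (b0, b1)."""
    X = np.column_stack([np.ones(len(impact)), impact])
    b, *_ = np.linalg.lstsq(X, abs_ret, rcond=None)
    return X @ b, b

def second_stage(dep, Z):
    """Regress a dependent variable on a constant and the regressors in Z."""
    Z1 = np.column_stack([np.ones(len(dep)), Z])
    a, *_ = np.linalg.lstsq(Z1, dep, rcond=None)
    return a

# Synthetic illustration only: Z stands in for similarity measures and controls.
rng = np.random.default_rng(1)
n = 2000
Z = rng.normal(size=(n, 3))
# The price impact measure is constructed to fall when the second regressor rises.
ppi = 1.0 - 0.3 * Z[:, 1] + 0.1 * rng.normal(size=n)
abs_ret = 0.2 + 0.5 * ppi + 0.05 * rng.normal(size=n)

fitted, b = first_stage(abs_ret, ppi)   # portion of |r| explained by PPI
a = second_stage(fitted, Z)             # loadings of that portion on Z
a_ppi = second_stage(ppi, Z)            # loadings of PPI itself on Z

print("second-stage slopes:", a[1:])
print("b1 * slopes of PPI on Z:", b[1] * a_ppi[1:])
```

In this sketch, the sign of each second-stage loading is the sign of the corresponding loading of the price impact measure on the regressor whenever $$b_{1i}$$ is positive.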
To summarize, we find that the short-horizon volatility of most stocks loads negatively on a measure of similarity in the strategies of HFT firms that follow the underlying common strategy we associate with market making. This is contrary to concerns that correlated HFT activity may increase market fragility. Therefore, competition between HFT firms, especially those that pursue market making, could benefit the market in more than one way: not just through the possible reduction in economic rents but also through a reduction in short-horizon volatility. It is important to stress, though, that this result is observed during “normal” times, and may not hold during a market breakdown episode like the U.S. Flash Crash in May of 2010.

4.2 HFT competition and market concentration

Equity trading markets in the U.S., Canada, and many other countries are characterized by multiple trading venues on which stocks can be traded. Such trading fragmentation may create negative externalities in the form of lower price integrity and higher costs as liquidity is scattered across trading venues. Against these negative externalities, proponents of this structure argue that the competition it induces between trading venues results in lower fees and greater innovation in terms of trading technology and services. HFT is pertinent to the debate on market fragmentation. Brokers route customer orders to trading venues using routing tables that take into account multiple parameters, including past execution quality as well as rebates and fees. Under the best execution requirement of the Universal Market Integrity Rules (UMIR) in Canada, however, brokers need to make reasonable efforts to ensure that customer orders receive the best prices. This was formalized during our sample period in the “Best Price Obligation” (UMIR Section 5.2). 
Therefore, a key criterion in the routing of each order necessitates determining which trading venues display the best price when the order arrives, and brokers are required to take into account information from all venues on which a security is traded.34 As such, trading venues that display the best prices or the smallest spreads attract more customer order flow from brokers. Given the central role HFT firms play in setting prices and spreads, a trading venue’s market share depends on these firms’ activity. In particular, HFT liquidity provision on new trading venues can be essential to fostering competition between venues.35 Fragmentation of the trading environment among multiple trading venues could also create profit opportunities for HFT firms in moving liquidity from one venue to another and ensuring that prices are the same across the venues. Hence, trading venue proliferation and HFT activity are interrelated. What is less well understood, however, is how competition between HFT firms, which is reflected in the pursuit of similar strategies by multiple firms, affects the competitiveness of trading venues and the concentration of trading in the market. This is the question we explore in this section. To examine market concentration of trading volume in stock $$i$$, we compute the Herfindahl-Hirschman Index (HHI$$_{i})$$ of market share for the five trading venues: Alpha ATS Limited Partnership (ALF), Chi-X Canada ATS (CHX), Omega ATS (OMG), Pure Trading (PTX), and the Toronto Stock Exchange (TSX). The lower the HHI$$_{i}$$, the less concentrated the market.36 To construct a measure of similarity in HFT strategies for each stock, we compute the correlation coefficient between the activities of pairs of HFT firms over all time intervals, and average the correlations for all pairs of HFT firms in each product category. 
If MSG$$_{itk}$$ is the message activity of HFT firm $$k$$ in stock $$i$$ and interval $$t$$, then $$\mathrm{Cor}(\mathrm{MSG}_{k},\mathrm{MSG}_{l})_{i}$$ is the correlation between the activities of HFT firms $$k$$ and $$l$$ in stock $$i$$ computed over all intervals. There are $$0.5 \times 8 \times (8-1)=28$$ such correlations for a group of eight HFT firms that follow the underlying common strategy represented by the first principal component, and our stock-specific similarity measure, StockSimPC1$$_{i}$$, is computed as the average of these correlations for the 28 pairs. We compute StockSimPC2$$_{i}$$ and StockSimPC3$$_{i}$$ in a similar fashion for the groups of firms that follow the underlying common strategies we associate with the second and third principal components, respectively. It is important to note that the relationship between these two variables (market concentration and HFT competition) is best viewed as being jointly determined in equilibrium rather than as one causing the other. While competition between HFT firms could enhance liquidity provision and hence the viability of smaller trading venues, the multiplicity of trading venues could also increase the profit opportunities from arbitrage activity and market making and lead to intensified competition between HFT firms. Therefore, we examine the cross-sectional association between HHI$$_{i}$$ and the measures of similarity in HFT strategies without assuming causality. Still, we would like to control for several variables that could influence the level of concentration in a stock. Specifically, we use three variables (market capitalization, price level, and the standard deviation of 30-minute returns over the sample period) to control for heterogeneity in fundamental attributes across stocks. We employ two additional variables, time-weighted average spread and market-wide bid-and-offer depth, to control for the liquidity environment of a stock. 
Lastly, since our goal is to study the effect of similarity in HFT strategies rather than the magnitude of HFT activity per se, we add the magnitude of HFT activity in stock $$i$$ as a control variable.37 Table 6 presents the partial correlations between HHI and the HFT similarity measures controlling for market capitalization, price level, volatility, spread, depth, and the magnitude of HFT activity.38 The results are presented side-by-side for three HFT activity variables (MSG, TRD, and LMT) and three interval lengths (1, 10, and 60 seconds). We observe very intuitive results: the partial correlations between HHI and StockSimPC1 are negative and highly statistically significant in all permutations, and there is a similar strong negative relationship between HHI and StockSimPC2. If one is willing to accept our observation in Section 3 that the first principal component is related to cross-venue arbitrage, then we would expect competition in this underlying common strategy to be negatively related to market concentration because it improves the quality of prices displayed on the smaller trading venues. Competition between HFT firms that follow the second principal component, which we associate with market making, should exhibit a strong negative relationship with market concentration if competition in liquidity provision on smaller venues is important to attracting order flow from brokers. Competition between HFT firms and competition between trading venues appear tightly linked in a “competition begets competition” manner. 
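The two stock-level measures used in this analysis, the concentration index and the similarity measure, can be sketched as follows. The function names and the synthetic activity series are hypothetical. Market shares are expressed as fractions here, so the index lies between 0.2 (five equal venues) and 1 (a single venue), which is an assumption about scaling.

```python
import random
from itertools import combinations
from math import sqrt

def hhi(volumes):
    """Herfindahl-Hirschman index: sum of squared market shares (as fractions)."""
    total = sum(volumes)
    return sum((v / total) ** 2 for v in volumes)

def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def stock_similarity(activity_by_firm):
    """Average pairwise correlation of firms' per-interval activity in one stock."""
    pairs = list(combinations(sorted(activity_by_firm), 2))
    corrs = [pearson(activity_by_firm[k], activity_by_firm[l]) for k, l in pairs]
    return sum(corrs) / len(corrs), len(pairs)

# Hypothetical venue volumes for one stock (five venues).
print(hhi([100, 50, 25, 15, 10]))

# Synthetic activity for eight firms sharing a common component.
rng = random.Random(0)
common = [rng.gauss(0, 1) for _ in range(500)]
activity = {f"firm{k}": [c + 0.5 * rng.gauss(0, 1) for c in common] for k in range(8)}
sim, n_pairs = stock_similarity(activity)
print(sim, n_pairs)
```

With eight firms the sketch averages over 28 pairs, matching the pair count given in the text for the first principal component group.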
Table 6 Market concentration and HFT competition

| | MSG I1 | MSG I2 | MSG I3 | TRD I1 | TRD I2 | TRD I3 | LMT I1 | LMT I2 | LMT I3 |
|---|---|---|---|---|---|---|---|---|---|
| StockSimPC1 | –0.539 | –0.587 | –0.689 | –0.375 | –0.424 | –0.535 | –0.539 | –0.588 | –0.686 |
| | (< .001) | (< .001) | (< .001) | (0.010) | (0.003) | (< .001) | (< .001) | (< .001) | (< .001) |
| StockSimPC2 | –0.564 | –0.483 | –0.490 | –0.502 | –0.422 | –0.500 | –0.554 | –0.500 | –0.536 |
| | (< .001) | (0.001) | (0.001) | (< .001) | (0.004) | (< .001) | (< .001) | (< .001) | (< .001) |
| StockSimPC3 | 0.031 | –0.086 | –0.150 | –0.075 | –0.111 | –0.038 | 0.022 | –0.083 | –0.152 |
| | (0.840) | (0.571) | (0.319) | (0.622) | (0.462) | (0.802) | (0.885) | (0.581) | (0.313) |

This table presents the results of an analysis that relates concentration of trading across the trading venues to competition between high-frequency trading (HFT) firms in three product categories (or underlying common strategies). The five trading venues we investigate are organized as electronic limit-order books and together execute approximately 97.7% of the trading volume in Canada during our sample period. Our sample consists of 52 stocks from the S&P/TSX 60 (S&P). To examine market concentration, we compute the Herfindahl-Hirschman index (HHI$$_{i}$$) of market share of the five trading venues for each stock. The HHI is computed as the sum of squared market shares of the trading venues, and the lower the HHI, the less concentrated the market. We use data from the Investment Industry Regulatory Organization of Canada (IIROC) to identify 31 high-frequency trading (HFT) firms.
To represent HFT competition we construct a measure of similarity in strategies of the HFT firms in each of the product categories. To construct the measure for stock $$i$$ for the first principal component (StockSimPC1), we compute the correlation coefficient between the activities of pairs of HFT firms in stock $$i$$ over all time intervals, and average across all pairs of the eight firms. We construct the measures for the firms that load significantly on the second and third principal components (StockSimPC2 and StockSimPC3, respectively) in an analogous fashion. We examine three representations of HFT activity: (1) the number of “messages” (MSG) HFT firms send to the market, where messages are defined as submissions and cancellations of nonmarketable limit orders as well as executions of marketable limit orders, (2) trades (TRD), and (3) submissions/cancellations of nonmarketable limit orders (LMT). All measures are computed separately for three interval lengths: I1 (1-second intervals), I2 (10-second intervals), and I3 (60-second intervals). The table shows the partial correlations between HHI and the three similarity measures when we control for market capitalization, price level, volatility, spread, depth, and the magnitude of activity of the same HFT firms that are included in the similarity measure. We report $$p$$-values for the (two-sided) hypothesis that the partial correlation is equal to zero in parentheses. 
We are not studying in this paper how trading venues seek to affect the brokers’ routing tables to attract marketable orders. Rather, we study how competition between HFT firms could affect the market share of trading venues using the link created by the brokers’ best execution obligation between the market share of trading venues and the extent to which these venues post the best prices.
If HFT competition gives rise to better prices on smaller trading venues, it could in principle help these venues increase their market share and thereby reduce concentration in the market. To examine this channel, we investigate whether greater similarity in HFT strategies on each of the five trading venues increases the percentage of time it displays the best prices or narrowest spreads. Panel A of Table 7 presents summary statistics on the market share of the five trading venues and two measures of their competitiveness (or viability): (1) the percentage of time that the trading venue posts either the best bid or the best ask in the market (where the market is defined as the aggregation of all five trading venues) and (2) the percentage of time that the bid-ask spread on the trading venue is the narrowest spread in the market.39 We observe a dominant trading venue in Canada with a market share of 69.26% that displays the best price 92% of the time (averaged across all stocks in our sample). Similarly, the largest trading venue has the narrowest spread 76.8% of the time (compared with 54.2%, 36.1%, 26.9%, and 9.4% for the other four trading venues).

Table 7 Trading venue competitiveness and the similarity in HFT strategies

A. Summary statistics for market share and competitiveness measures (N $$=$$ 52)

| Measure | Statistic | A | B | C | D | E |
|---|---|---|---|---|---|---|
| %Market share of volume | Mean | 14.3 | 11.8 | 0.4 | 1.9 | 69.3 |
| | SD | 6.7 | 3.4 | 0.6 | 1.3 | 8.0 |
| | Min | 4.8 | 3.7 | 0.0 | 0.4 | 45.1 |
| | 25th perc | 9.5 | 9.8 | 0.1 | 1.2 | 64.3 |
| | Median | 13.5 | 12.2 | 0.1 | 1.6 | 70.4 |
| | 75th perc | 16.7 | 14.0 | 0.5 | 2.3 | 75.9 |
| | Max | 34.8 | 17.6 | 2.7 | 7.2 | 82.8 |
| %TimeBestPrices | Mean | 65.2 | 83.2 | 32.7 | 66.0 | 92.0 |
| | SD | 22.6 | 13.7 | 17.4 | 33.0 | 6.1 |
| | Min | 15.0 | 39.6 | 8.1 | 0.1 | 71.1 |
| | 25th perc | 50.9 | 80.6 | 20.8 | 37.0 | 89.8 |
| | Median | 71.9 | 87.0 | 28.0 | 79.6 | 92.5 |
| | 75th perc | 84.9 | 91.0 | 40.7 | 96.3 | 96.7 |
| | Max | 92.2 | 98.0 | 82.8 | 99.3 | 98.9 |
| %TimeSmallSpread | Mean | 36.1 | 54.2 | 9.4 | 26.9 | 76.8 |
| | SD | 24.0 | 15.8 | 14.3 | 19.9 | 8.0 |
| | Min | 1.8 | 13.2 | 0.1 | 0.0 | 50.4 |
| | 25th perc | 13.4 | 44.2 | 1.9 | 6.4 | 72.8 |
| | Median | 37.0 | 58.1 | 3.3 | 27.3 | 78.5 |
| | 75th perc | 57.1 | 65.3 | 10.7 | 44.9 | 82.5 |
| | Max | 83.5 | 83.8 | 69.3 | 67.6 | 89.8 |

B. Coefficients on StockSim$$_{\mathrm{iv}}$$ from regressions on competitiveness measures

| Measure | Interval | Venue A | Venue B | Venue C | Venue D | Venue E |
|---|---|---|---|---|---|---|
| %TimeBestPrices | I1 | 0.658 | 0.220 | 0.970 | 4.281 | –0.340 |
| | | (2.15) | (0.92) | (1.27) | (3.20) | (–3.64) |
| | I2 | 0.704 | 0.325 | 1.474 | 2.091 | –0.199 |
| | | (2.85) | (1.46) | (2.17) | (1.11) | (–2.15) |
| | I3 | 0.661 | 0.348 | 1.427 | 0.781 | –0.096 |
| | | (2.70) | (1.76) | (3.31) | (0.53) | (–1.26) |
| %TimeSmallSpread | I1 | 1.218 | 0.891 | 0.735 | 2.472 | –0.875 |
| | | (2.64) | (2.61) | (1.17) | (2.77) | (–4.63) |
| | I2 | 1.188 | 0.891 | 0.966 | 1.375 | –0.660 |
| | | (2.77) | (2.73) | (1.79) | (1.39) | (–3.37) |
| | I3 | 1.042 | 0.794 | 0.818 | 0.539 | –0.438 |
| | | (2.60) | (2.68) | (2.48) | (0.72) | (–2.72) |

This table presents summary statistics for measures of the competitiveness (or viability) of trading venues and regression results that relate them to competition between high-frequency trading (HFT) firms in three product categories (or underlying common strategies). The five trading venues we investigate are all organized as electronic limit order books and together execute 97.7% of the volume in Canada during our sample period. We denote the five trading venues in the table by the letters A through E.
Panel A provides cross-sectional summary statistics for market share in terms of volume, as well as for the two measures of trading venue viability or competitiveness: (1) %TimeBestPrices, defined as the percentage of time that the trading venue posts either the best bid or the best ask in the market (where the market is defined as the aggregation of all five trading venues), and (2) %TimeSmallSpread, defined as the percentage of time that the bid–ask spread on the trading venue is the narrowest spread in the market. In panel B we examine whether correlated activity of HFT firms on a particular trading venue is helpful for the competitive position of the trading venue by manifesting in better prices and spreads. To ensure that our results could be interpreted in the same manner for all trading venues, we focus on the HFT firms that engage in substantial activity on all five trading venues, which we define as sending at least 10,000 messages to each of the venues. Eight HFT firms satisfy this criterion, and we compute an HFT competition measure, StockSim$$_{\textit{iv}}$$, that provides information on the similarity in strategies of these HFT firms for a given stock on a particular trading venue. We use MSG (the number of messages HFT firms send to the market) to represent the strategies. 
For each trading venue $$v$$, we run the following cross-sectional regression: \begin{align*} C_{\textit{iv}}&=a_{0} +a_{1} \textit{StockSim}_{\textit{iv}} +a_{2} \textit{Activity}_{\textit{iv}} +a_{3} \textit{Spread}_{iv} +a_{4} \textit{BBOdepth}_{\textit{iv}}\\&\quad +\,a_{5} \textit{MktCap}_{i} +a_{6} \textit{Price}_{i} +a_{7} \textit{Volatility}_{i} + \textit{error}_{\textit{iv}}, \end{align*} where $$C_{\textit{iv}}$$ is one of the two viability measures, and as control variables we include the magnitude of HFT activity on the specific trading venue (Activity$$_{\textit{iv}})$$, spreads and depth (now computed from the best prices on a single trading venue), market capitalization, price level, and the standard deviation of 30-minute returns over the sample period. We report in panel B only the coefficients on StockSim$$_{\textit{iv}}$$ from each regression (with heteroscedasticity-consistent $$t$$-statistics in parentheses) for the five trading venues and the three time intervals (I1, I2, and I3).

To study how HFT competition affects the viability of trading venues, we compute the HFT similarity measures for each venue by considering activity on that venue only. However, many of the HFT firms are active on only some of the trading venues.
To ensure that our results could be interpreted in the same manner for all trading venues, we restrict attention in this test to the HFT firms that send at least 10,000 messages to each of the trading venues during our sample period.40 We construct the measure of similarity in the strategies of HFT firms in each stock as before: computing the correlation coefficient between the activities of pairs of HFT firms over all time intervals, and averaging the correlations for the pairs of HFT firms. The two differences from before are that we compute a separate similarity measure for each trading venue using activity on that venue only, and we focus on a set of eight firms that send messages to all trading venues. We use this similarity measure, StockSim$$_{\textit{iv}}$$, to represent competition between HFT firms in stock $$i$$ on trading venue $$v$$. For each trading venue, we run the following cross-sectional regression:

\begin{align} C_{\textit{iv}} & = a_{0} +a_{1} \textit{StockSim}_{\textit{iv}} +a_{2} \textit{Activity}_{\textit{iv}} +a_{3} \textit{Spread}_{\textit{iv}} +a_{4} \textit{BBOdepth}_{\textit{iv}} \notag\\ &\quad + \,a_{5} \textit{MktCap}_{i} +a_{6} \textit{Price}_{i} +a_{7} \textit{Volatility}_{i} + \textit{error}_{\textit{iv}}, \tag{2} \end{align}

where $$C_{\textit{iv}}$$ stands for one of the two competitiveness measures for stock $$i$$ on trading venue $$v$$.41 As control variables we include the magnitude of HFT activity on the specific trading venue (Activity$$_{\textit{iv}}$$), spreads and depth (now computed from the best prices on a single trading venue), market capitalization, price level, and the standard deviation of 30-minute returns over the sample period. To present results for the five trading venues and the three interval lengths side-by-side in panel B of Table 7, we report only the coefficient on StockSim$$_{\textit{iv}}$$ from each regression (with heteroscedasticity-consistent $$t$$-statistics).
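For readers who wish to see the mechanics of a regression like Equation (2), the following is a minimal sketch of ordinary least squares with heteroscedasticity-consistent $$t$$-statistics for one venue. The text does not specify which heteroscedasticity-consistent estimator is used, so the White (HC0) form below is an assumption, as are the function name and the synthetic data; the first regressor plays the role of StockSim$$_{\textit{iv}}$$ and the remaining six stand in for the controls.

```python
import numpy as np


def ols_hc0(y, X):
    """OLS coefficients and White (HC0) t-statistics.

    X holds the regressors without an intercept; a constant column is
    prepended here, so element 0 of each output refers to the intercept.
    """
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    # Sandwich estimator: (X'X)^-1 [X' diag(e^2) X] (X'X)^-1
    meat = X.T @ (X * resid[:, None] ** 2)
    cov = xtx_inv @ meat @ xtx_inv
    t_stats = beta / np.sqrt(np.diag(cov))
    return beta, t_stats


# Synthetic cross-section of 52 stocks on one venue (illustration only).
rng = np.random.default_rng(0)
n = 52
X = rng.normal(size=(n, 7))          # column 0: StockSim, columns 1-6: controls
noise = rng.normal(scale=1.0 + 0.5 * np.abs(X[:, 0]), size=n)  # heteroscedastic
y = 0.5 + 0.7 * X[:, 0] + noise      # y stands in for a competitiveness measure
beta, t_stats = ols_hc0(y, X)
print(f"a1 = {beta[1]:.3f}, t = {t_stats[1]:.2f}")
```

Running this once per venue and per interval length, and keeping only the second coefficient and its $$t$$-statistic, would give the layout of panel B of Table 7.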
We identify an interesting pattern: the coefficients for the smaller trading venues are positive, although not all of them are statistically significant. For the largest trading venue in terms of market share, however, we find the opposite result: the coefficient is negative and statistically significant in almost all regressions. Therefore, the effect of similarity in HFT strategies on the viability or competitiveness of a trading venue depends on the nature of that venue: it benefits smaller venues that introduce competition into the market and detracts from the dominant position of the largest trading venue. The disparity in results between the smaller trading venues and the dominant venue suggests that reverse causality is less likely to be a concern in these regressions. We also believe that the economics of trading on these venues is such that the percentage of time a trading venue posts the best prices is determined by the strategies of the HFT firms, rather than vice versa, because HFT firms are the dominant players in terms of liquidity provision. Our results therefore suggest that one driver of the negative relationship we document between market concentration and HFT competition is that HFT competition enhances the viability of smaller trading venues (in terms of displaying better prices and smaller spreads), which increases their market share.

5. Additional Results and Robustness

In this section we provide additional discussion concerning the nature of the variables that represent HFT activity as well as the robustness of the PCA results. The Online Appendix discusses additional robustness analysis that concerns the PCA methodology (e.g., PCA commonalities, additional component scores regression specifications, and the use of alternative methodologies, such as sparse principal component analysis).
5.1 Stylized facts on similarity in the activity of HFT firms

In this analysis, we establish two stylized facts about similarity in the activity of HFT firms that are directly relevant to the economics of HFT strategies and hence are relevant to the design of the tests in our paper. First, we consider whether the predominant similarity in HFT strategies involves directional activity or total activity. Second, we examine whether similarity in HFT strategies changes on days when the market experiences large positive or negative returns. We construct two similarity measures for the activities of the 31 HFT firms: market-wide and stock-specific. Our market-wide similarity measure, which is defined and used in Section 4.1, provides information pertaining to whether the strategies of HFT firms are correlated across stocks at a given time. The stock-specific similarity measure, which is defined and used in Section 4.2, provides information pertaining to whether the strategies of HFT firms are correlated over time for a given stock.

Our analysis of directional activity is motivated by empirical studies on herding in financial markets (e.g., Wermers 1999; Khandani and Lo 2007, 2011; Choi and Sias 2009; Pedersen 2009; Brown, Wei, and Wermers 2014) that recognize the destabilizing influence that simultaneous actions in one direction (buying or selling) by institutional investors can have on asset prices. In Figure 2, we compare the similarity in total HFT activity (buy plus sell orders) and directional HFT activity (buy minus sell orders).42 We observe a striking result: the similarity measures involving total activity are four to twelve times the magnitude of those involving directional activity. For example, panel A of Figure 2 shows that when we use all messages a firm sends to the market to represent HFT activity, the market-wide similarity measure of total activity for the largest HFT firms (MS1) is 0.359 versus 0.073 for the directional message activity.
A similar pattern is observed for the stock-specific similarity measure reported in panel B. The picture that emerges is consistent with HFT firms operating simultaneously on both sides of the market or buying and selling very rapidly within the same one-second interval (rather than pursuing very strong directional trading for long periods of time). Many HFT firms design strategies to interact with uninformed order flow. Market microstructure theory often specifies that uninformed order flow is nondirectional (in contrast to informed order flow), which could potentially explain our finding of low directional similarity of HFT strategies. Given these findings, we focus in the rest of the paper on analyzing similarity in total activity.

Figure 2 Similarity in HFT strategies: Total versus directional
In this figure, we compare measures of similarity in the strategies of high-frequency trading (HFT) firms for total versus directional activity. We use data from the Investment Industry Regulatory Organization of Canada (IIROC) to identify 31 HFT firms. These firms are further categorized into three subgroups according to market share: MS1 (market share $$>$$ 4%), MS2 (market share of between 1% and 4%), and MS3 (the rest). We compare the magnitudes of the market-wide similarity measures in panel A, and the stock-specific similarity measures in panel B for three variables that describe total HFT activity (defined as buy plus sell orders) and directional (“Net”) HFT activity (defined as buy minus sell orders): MSG (all messages HFT firms send to the market), TRD (all their trades), and LMT (submissions and cancellations of nonmarketable limit orders). The market-wide similarity measures indicate whether the strategies of HFT firms are correlated across stocks in a particular time interval. For each one-second time interval, we compute the correlation coefficient between the activities of two HFT firms across the stocks in the sample, and average the correlations for all pairs of firms. The stock-specific similarity measures indicate whether the strategies of HFT firms are correlated over time for a particular stock. For each stock, we compute the correlation coefficient between HFT activities of any two HFT firms, and average these across all pairs of firms.

Our analysis of similarity in the activity of HFT firms in three distinct market conditions is motivated by the concern that HFT firms react to adverse market conditions (in terms of declining prices) by changing their strategies and hence cause greater fragility. Our study is designed to address this issue because we include in our sample period the 10 worst days in terms of index returns from June 2010 through March 2011. Figure 3 presents the similarity measures computed separately for the down, flat, and up days, alongside the similarity measures for the entire 30-day period. We see, for example, that the market-wide similarity measures for MSG of the largest HFT firms (MS1) are almost identical on down, flat, and up days (0.354, 0.361, and 0.362, respectively). We do not observe that these three distinct market environments result in correspondingly distinct similarity measures. This stylized fact, like the previous one concerning lack of similarity in directional trading, is reassuring in terms of market fragility, although it does not preclude the possibility that similarity in HFT strategies would increase during times of extreme stress, such as the Flash Crash in the U.S. The absence of such extreme episodes in Canadian markets, however, prevents us from examining this possibility empirically. Given these findings, results in the rest of the paper are presented for the entire sample period rather than by market condition.43

Figure 3 Similarity in HFT strategies and market conditions
This figure compares measures of similarity in the strategies of high-frequency trading (HFT) firms for varying market conditions. We use data from the Investment Industry Regulatory Organization of Canada (IIROC) to identify 31 HFT firms.
These firms are further categorized into three subgroups according to market share: MS1 (market share $$>$$ 4%), MS2 (market share of between 1% and 4%), and MS3 (the rest). We rank the daily returns of the S&P/TSX Composite Index from June 2010 through March 2011, and select the 10 worst days (down days), the 10 best days (up days), and the 10 days closest to and centered on zero return (flat days), for a total of 30 days (sample period). We compare the magnitudes of the market-wide similarity measures in panel A, and the stock-specific similarity measures in panel B for down days, flat days, and up days. For each period we examine the similarity in HFT strategies in terms of the number of “messages” (MSG) they send to the market, where messages are defined as submissions and cancellations of nonmarketable limit orders as well as execution of marketable limit orders. The market-wide similarity measures indicate whether the strategies of HFT firms are correlated across stocks in a particular time interval. For each one-second time interval, we compute the correlation coefficient between the activities of two HFT firms across the stocks in the sample, and average the correlations for all pairs of firms. The stock-specific similarity measures indicate whether the strategies of HFT firms are correlated over time for a particular stock. For each stock, we compute the correlation coefficient between HFT activities of any two HFT firms, and average these across all pairs of firms.

5.2 HFT activity variables

The data we obtain are unique in terms of our ability to identify HFT firms, and we use these data to create three variables that describe their activity (MSG, TRD, and LMT).
Since most researchers have access only to publicly available data, an interesting question is whether our analysis can provide guidance on which publicly available measures are best for capturing general HFT activity as well as the specific underlying common strategies we identify.44 Specifically, we look at four measures that can be constructed from publicly available order-level data:

Ord2Vol: number of limit-order submissions / (1 + value of trades (in CAD $100)).
Canc2Trd: number of limit-order cancellations / (1 + number of trades).
CancelRatio: number of limit-order cancellations / (1 + number of limit-order submissions and cancellations).
RunsInProcess: time-weighted number of strategic runs in an interval.

The first two measures reflect order (or cancellation) intensity normalized by trading. Related measures have been used widely in the literature to describe algorithmic trading and HFT (e.g., Hendershott, Jones, and Menkveld 2011; Hagströmer and Norden 2013; and Boehmer, Fong, and Wu 2015). Unlike the first two measures, CancelRatio does not utilize trade executions but rather focuses on the intensity of cancellations normalized by limit-order submission. This measure is meant to capture the characterization of HFT firms in the SEC (2014) concept release as engaging in “the submission of numerous orders that are cancelled shortly after submission.” The last measure is proposed by Hasbrouck and Saar (2013) and is based on strategic runs, which are linked submissions, cancellations, and executions that are likely to be parts of a dynamic algorithmic strategy and that can be identified with publicly available order-level data. RunsInProcess is constructed for each interval by time weighting the number of strategic runs that are longer than 10 messages.
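The three ratio measures defined above can be illustrated with a minimal Python sketch. It is not the authors' code: the event record layout (`stock`, `time`, `kind`, `value`) and the function name `public_measures` are hypothetical, and the sketch reads "value of trades (in CAD $100)" as trade value expressed in units of CAD 100. RunsInProcess is omitted because it requires linking messages into strategic runs, which depends on details in Hasbrouck and Saar (2013) that are not reproduced here.

```python
from collections import defaultdict


def public_measures(events, interval_seconds=1.0):
    """Compute Ord2Vol, Canc2Trd and CancelRatio per (stock, interval index).

    `events` is an iterable of dicts with hypothetical keys:
      stock : ticker symbol
      time  : seconds since the start of the trading day
      kind  : 'submit', 'cancel' or 'trade'
      value : CAD value of the trade (only read when kind == 'trade')
    """
    counts = defaultdict(
        lambda: {"submit": 0, "cancel": 0, "trade": 0, "value": 0.0}
    )
    for ev in events:
        # Assign each event to its fixed-length calendar interval.
        key = (ev["stock"], int(ev["time"] // interval_seconds))
        bucket = counts[key]
        bucket[ev["kind"]] += 1
        if ev["kind"] == "trade":
            bucket["value"] += ev["value"]

    measures = {}
    for key, c in counts.items():
        measures[key] = {
            # Assumption: trade value is measured in units of CAD 100.
            "Ord2Vol": c["submit"] / (1 + c["value"] / 100.0),
            "Canc2Trd": c["cancel"] / (1 + c["trade"]),
            "CancelRatio": c["cancel"] / (1 + c["submit"] + c["cancel"]),
        }
    return measures
```

Adding 1 to each denominator, as in the definitions above, keeps the ratios defined in intervals with no trades or no limit-order messages. Intervals with no events at all do not appear in the output of this sketch; a full implementation would have to decide how to treat them.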
We use our order-level data, without taking advantage of the special user IDs or various other flags, to compute these four publicly available measures for the 52 sample stocks and all 1-second intervals in the sample period. Panel A of Table 8 presents Pearson and Spearman correlations showing that all four measures are positively correlated, with the highest correlations observed between the two measures that have executed trades in the denominator (Ord2Vol and Canc2Trd). Panel B of Table 8 presents the correlations of the publicly available measures with the three representations of HFT activity we use in the paper (MSG, LMT, and TRD) aggregated across the 31 HFT firms and the component scores representing the three underlying common strategies. All four publicly available measures are positively correlated with MSG, our main measure representing the activities of HFT firms, with Pearson correlations ranging from 0.178 to 0.765 and Spearman correlations from 0.424 to 0.906. Canc2Trd (the ratio of limit-order cancellations to the number of trades) has the highest correlations with MSG. Ord2Vol is highly correlated with MSG, but is negatively correlated with TRD, likely because it has volume in the denominator. In terms of underlying common strategies, the three ratios (Ord2Vol, Canc2Trd, and CancelRatio) are correlated most highly with the first principal component that we associate with cross-venue arbitrage. RunsInProcess has the highest correlation with the second principal component that we associate with market making.

Table 8 Publicly available measures

A. Correlations among publicly available measures

                 Ord2Vol  Canc2Trd  CancelRatio  RunsInProcess
Pearson
  Ord2Vol        1
  Canc2Trd       0.943    1
  CancelRatio    0.238    0.295     1
  RunsInProcess  0.135    0.156     0.320        1
Spearman
  Ord2Vol        1
  Canc2Trd       0.842    1
  CancelRatio    0.623    0.850     1
  RunsInProcess  0.383    0.417     0.353        1

B. Correlations of publicly available measures with HFT variables

                 Ord2Vol  Canc2Trd  CancelRatio  RunsInProcess
Pearson
  MSG            0.606    0.765     0.259        0.178
  LMT            0.611    0.769     0.259        0.178
  TRD            –0.048   0.012     0.127        0.104
  PC1            0.385    0.526     0.237        0.138
  PC2            0.075    0.130     0.148        0.176
  PC3            0.274    0.325     0.069        –0.054
Spearman
  MSG            0.741    0.906     0.728        0.424
  LMT            0.743    0.907     0.728        0.426
  TRD            –0.113   0.238     0.247        0.138
  PC1            0.568    0.691     0.535        0.332
  PC2            0.037    0.140     0.124        0.210
  PC3            0.237    0.267     0.237        0.022

C. Spearman correlations of publicly available measures with HFT variables by interval length

                 Ord2Vol  Canc2Trd  CancelRatio  RunsInProcess
I1 (1-second intervals)
  MSG            0.741    0.906     0.728        0.424
  LMT            0.743    0.907     0.728        0.426
  TRD            –0.113   0.238     0.247        0.138
  PC1            0.568    0.691     0.535        0.332
  PC2            0.037    0.140     0.124        0.210
  PC3            0.237    0.267     0.237        0.022
I2 (10-second intervals)
  MSG            0.115    0.731     0.411        0.568
  LMT            0.120    0.734     0.412        0.569
  TRD            –0.544   0.045     0.122        0.344
  PC1            0.077    0.519     0.254        0.469
  PC2            0.163    0.569     0.330        0.202
  PC3            –0.164   0.111     0.074        0.329
I3 (60-second intervals)
  MSG            0.192    0.549     0.293        0.623
  LMT            0.196    0.553     0.295        0.623
  TRD            –0.341   –0.053    –0.060       0.555
  PC1            0.256    0.549     0.368        0.220
  PC2            0.181    0.437     0.207        0.519
  PC3            –0.211   –0.046    –0.095       0.432

This table presents correlations between variables that we use to represent high-frequency trading (HFT) activity and measures that can be constructed from publicly available order-level data. The data we obtain are unique in terms of our ability to identify HFT firms, and we use these data to create three variables that describe their activity: MSG (all messages HFT firms send to the market), TRD (all their trades), and LMT (submissions and cancellations of nonmarketable limit orders). We also use PCA to identify three underlying common strategies and use the principal component scores to represent their activity (PC1, PC2, and PC3). We consider four measures that can be constructed from publicly available order-level data:
Ord2Vol: number of limit-order submissions / (1 + value of trades (in CAD $100))
Canc2Trd: number of limit-order cancellations / (1 + number of trades)
CancelRatio: number of limit-order cancellations / (1 + number of limit-order submissions and cancellations)
RunsInProcess: time-weighted number of strategic runs in an interval (where strategic runs are linked submissions, cancellations, and executions that are likely to be parts of a dynamic algorithmic strategy and that can be identified with publicly available order-level data).
We use our order-level data, without taking advantage of the special user IDs or various other flags, to compute these four publicly available measures for the 52 sample stocks and all 1-second intervals in our sample period. Panel A presents the Pearson and Spearman correlations for the four publicly available measures. Panel B presents the correlations of the publicly available measures with the HFT activity variables we use in the paper (MSG, LMT, and TRD) aggregated across the 31 HFT firms and the component scores representing the three underlying common strategies. In panel C we present the Spearman correlations of the publicly available measures with the HFT variables for the 1-, 10-, and 60-second intervals (I1, I2, and I3, respectively).

Two points are important to note with respect to these correlations.
First, the magnitude of the Pearson correlations is sensitive to the specific functional form used. For example, taking a log transformation of the variables increases the correlation between MSG and RunsInProcess from 0.178 to 0.401, bringing it closer to the Spearman correlation (0.424). Second, which publicly available measure is most highly correlated with our HFT variables seems to depend on the length of the interval. In panel C of Table 8 we present the Spearman correlations for the 1-, 10-, and 60-second intervals (I1, I2, and I3, respectively). As we lengthen the interval, RunsInProcess becomes the most highly correlated measure with the three HFT measures. It is also the only one that remains positively correlated with HFT trading activity and with all three underlying common strategies. Canc2Trd remains highly correlated with MSG, but becomes negatively correlated with HFT trading and with the underlying common strategy we associate with the third principal component. We conclude that RunsInProcess appears to be a reasonable proxy for HFT activity, and becomes an even better proxy as one lengthens the time intervals. The two ratios involving cancellations (normalized by the number of trades or the number of limit orders submitted) also perform rather well, although they are best when used over very short intervals.

5.3 Choice of activity representation for PCA
To describe the activity of HFT firms in Section 3, we concentrate on MSG, which comprises all messages an HFT firm sends to the market in an interval. In robustness analysis we find that HFT firms that are identified as following underlying common strategies using one activity variable are also identified as such using others. The identification of firms that follow the three underlying common strategies is identical whether we use MSG or LMT, and using TRD we identify only one fewer firm (depending on the cutoff for the loadings) than we do with MSG.
We also examine several additional measures for robustness. For example, when we implement the PCA for the buys component of MSG only (or the sells component), we obtain as an output precisely the same HFT firms following the three underlying common strategies. Similarly, the same HFT firms that we find using the MSG measure also follow the principal components when we analyze just the removal of HFT firms’ liquidity from the book, which may be of interest to market regulators concerned about market stability. We also implement the PCA with depth in the book (up to 10 price levels from the best prices) and find a similar inference except that most of the firms with large loadings on the first principal component when using MSG load on the second principal component in the depth analysis, and vice versa. All these measures that one can use as an input for the PCA are simply different representations of the activity generated by the HFT firms’ strategies, and we find that they lead to a similar inference regarding which HFT firms follow common strategies. We also construct a variable to represent HFT activity that consists only of aggressive orders: marketable orders, immediate-or-cancel (IOC) orders, and fill-or-kill (FOK) orders. It is important to note that these types of orders make up only a small fraction of the total messages of HFT firms. There are, however, eight HFT firms for which the ratio of aggressive orders to all orders is higher than 10% (one firm submits 54.93% aggressive orders), although none of these are large HFT firms. The PCA results in Section 3 indicate that these eight firms carry out unique strategies that are not related to one another in terms of the patterns of when or in which stocks they submit the orders (they do not follow the underlying common strategies we identify).
We find that the first principal component in the PCA of the aggressive orders is strongly positively related to HFT cross-venue activity, similar to our finding in Table 3. This could indicate that cross-venue messages that implement the arbitrage activity of HFT firms consist of both aggressive orders that take liquidity and submissions or cancellations of limit orders that adjust positions in the limit order books of the different trading venues (Foucault, Kozhan, and Tham 2017). Still, our analysis suggests that trying to differentiate strategies based on aggressive orders is less useful than including all messages HFT firms send to the market, perhaps because aggressive orders reveal only a partial picture of the HFT algorithm. We provide the analysis using aggressive orders in the Online Appendix.

5.4 Event-time analysis
HFT strategies respond to events in the market: submissions, cancellations, and executions of orders (see, e.g., Hasbrouck and Saar 2013). A natural concern is whether having many intervals without any events affects our PCA inference about the HFT strategies. We use several modifications of our methodology to assess this concern. The first, and most straightforward, modification is to eliminate all intervals without events. If the only intervals used in the estimation are those with events, the analysis de facto captures only occurrences that are concurrent with events. The second modification is to aggregate events into buckets, in the spirit of Easley, López de Prado, and O’Hara (2012). One advantage of this methodology is that the bucket size can be made stock-specific and hence reflect the unique pace of activity in each stock. We use two definitions of events to examine the robustness of our results, one using all messages and the other focusing only on trades. The Online Appendix provides detailed descriptions of the methodologies as well as tables with regression results. We find that our results are very robust to working in event time.
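The two event-time modifications described above can be sketched in Python as follows. This is an illustration only, not the procedure documented in the Online Appendix: the function names, the input layouts, and the decision to discard a trailing partial bucket are all assumptions made for the example.

```python
def drop_empty_intervals(activity):
    """First modification: keep only intervals with at least one event.

    `activity` is a list of rows, one per calendar-time interval; each row
    is a list of per-firm message counts for that interval.
    """
    return [row for row in activity if any(row)]


def bucket_events(events, bucket_size):
    """Second modification: aggregate events into equal-count buckets.

    `events` is a time-ordered list of (timestamp, firm_id) tuples for one
    stock. Each bucket holds `bucket_size` consecutive events, so buckets
    cover less clock time when the stock is more active. Returns one dict
    per full bucket mapping firm_id -> number of events in that bucket.
    Assumption of this sketch: a trailing partial bucket is discarded.
    """
    if bucket_size <= 0:
        raise ValueError("bucket_size must be positive")
    buckets = []
    for start in range(0, len(events) - bucket_size + 1, bucket_size):
        counts = {}
        for _, firm in events[start:start + bucket_size]:
            counts[firm] = counts.get(firm, 0) + 1
        buckets.append(counts)
    return buckets
```

Because `bucket_size` is an argument, it can be chosen separately for each stock, which is how a bucketing scheme of this kind can reflect the pace of activity in each stock. Passing only trade events instead of all messages gives the second event definition mentioned in the text.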
In terms of identifying the HFT firms that follow the underlying common strategies, we find that exactly the same 14 HFT firms identified in Table 2 have significant loadings when we eliminate the intervals with no activity. Sixteen HFT firms have significant loadings when we aggregate messages into buckets: the same 14 firms from our regular analysis and two firms that in our regular analysis are just below the cutoff (with 0.33 and 0.32 loadings). Thirteen firms have significant loadings when we aggregate trades into buckets: twelve of the same 14 firms that we identified before and one additional firm that had a 0.33 loading in the regular analysis. The bottom line is that the event-time PCA identifies an almost identical set of firms. Furthermore, we use the principal component scores regressions to investigate whether the principal components that come out of the event-time analysis represent the same underlying common strategies that we identify in the regular PCA. We find that taking out intervals without activity has no impact on the direction and significance of the coefficients in the regressions. Moreover, statistically significant variables that help us interpret the underlying common strategies in the original analysis are also significant and point to the same interpretation in the event-time regressions. This is a very reassuring result: the inference about the underlying common strategies is the same regardless of whether we use regular time intervals or event time.

5.5 Net trading revenues
Given that our PCA methodology identifies three underlying common strategies, we examine whether the firms that follow these strategies have higher or lower net trading revenues than firms that deploy more idiosyncratic algorithms.45 We find that firms that follow the first and third principal components have net trading revenues that are significantly higher than the net trading revenues of the firms that do not follow any of the underlying common strategies, and also higher than the firms that follow the second principal component. The higher net trading revenues of firms that follow at least two of the underlying common strategies could explain why we observe multiple firms in each product category: these are strategies with a higher revenue potential, and therefore they elicit more intense competition. The Online Appendix provides a detailed exposition and the results of this analysis.

6. Conclusions
In this paper, we examine product market competition in the HFT industry. The difficulty in pursuing such a study has to do with the idiosyncrasies of an industry in which the product is often not publicly defined or easily observed by market participants. While regulators can collect data on the actions of HFT firms, attempting to understand the product attributes (i.e., the essence of the strategy that generates the actions) is often difficult if not outright impossible. Against this backdrop, the idea behind our study is to define the main product categories in this industry by looking at whether multiple HFT firms follow related strategies. The empirical tool we use, principal component analysis, enables us to employ detailed data on the actions of individual HFT firms to identify underlying strategies that are common to multiple firms.
In and of itself, our finding that there are at least three main product categories is important because it shows that thinking about HFT as a single entity in terms of its impact on the market may afford limited insights. We show that these three underlying common strategies differ from one another not just in terms of types of HFT activity (e.g., cross-venue messages or passive liquidity provision), but also in terms of the market and limit-order book conditions associated with them. These strategies serve different functions and should be regarded separately by both regulators and market participants. While much of the literature in economics regards competition as a good attribute that lowers economic rents and facilitates innovation, there are cases in which such competition can have negative byproducts.46 Since competition between HFT firms manifests itself in more highly correlated actions across stocks, it is quite natural to be concerned that this competition will result in higher short-horizon volatility. We find the opposite result: the short-horizon volatility of stocks loads negatively on competition between HFT firms that follow the underlying common strategy we associate with market making. This negative relationship appears to be driven by the manner in which HFT competition in this particular product category affects both the permanent and temporary components of the price impact of trades. Lastly, the market structure for equity trading is currently fragmented among many trading venues. When brokers search for the best prices in the market to deliver best execution to their customers, the dominant role HFT firms play in liquidity provision and price arbitrage in the fragmented marketplace means that they directly influence the market share of trading venues. 
We find that greater HFT competition on smaller trading venues is helpful for the competitive position of these venues, but the opposite result is observed for the dominant trading venue in Canada. This is likely one of the drivers behind the negative relationship we document between market concentration and HFT competition. Our findings contribute several important insights to the study of the industrial organization of the HFT industry, but much remains unknown. In particular, there is no evidence pertaining to the process of product development and testing in this industry, a process that has implications for both barriers to entry (or the amount of funding required to develop profitable algorithms) and the stability of the markets (e.g., runaway algorithms that can wreak havoc and cause market disruptions). The importance of specific strategies for the well-functioning of our markets is also not completely understood. We view these as open questions and hope that future studies will be able to deepen our knowledge of this fascinating industry. We thank Matthew Baron, Evangelos Benos, Gregg Berman, James Brugler, Itay Goldstein (the editor), Bjorn Hagstromer, Johan Hombert, Adam Joseph-Clark, Elvis Jarnecic, Boyan Jovanovic, Andrew Karolyi, Albert Menkveld, Sophie Moinas, Dermot Murphy, Maureen O’Hara, Mike Prior, Ioanid Rosu, Mehmet Saglam, Mehrdad Samadi, and Giorgio Valente for helpful comments, as well as conference and seminar participants at the 2nd Dauphine Microstructure Workshop, the 3rd Annual SEC/CFP Conference on Financial Market Regulation, the 13th Annual IDC Conference, the FINRA/Columbia Market Structure Conference, Cornell University, Indiana University, Monash University, the PBC School of Finance at Tsinghua University, Tel Aviv University, the University of Victoria Wellington, and VU Amsterdam. 
Dan Li thanks the Investment Industry Regulatory Organization of Canada (IIROC) for providing her with the data used in this study during her work at IIROC as part of a research agreement, and Maureen Jensen for her support and many helpful discussions. All conclusions are those of the authors and do not represent the views of IIROC. Part of this research was conducted while Gideon Saar was visiting the School of Management at Tel Aviv University. An earlier version of this work was circulated under the title “Correlated High-Frequency Trading.” Supplementary data can be found on The Review of Financial Studies Web site. Footnotes 1 High-frequency trading therefore does not include agency algorithms that execute orders on behalf of investors; the term “HFT” is used only to denote proprietary trading operations conducted by stand-alone firms or the trading desks of larger financial firms. 2 In constructing measures of similarity in strategies, we use correlations between the activities of HFT firms. Several papers have examined correlations in the context of algorithmic trading or HFT. Chaboud et al. (2014) investigate algorithmic trading on the foreign-exchange EBS platform. Benos et al. (2017) use transactions data from the London Stock Exchange to study ten HFT firms. They use vector autoregressions to show a positive dynamic relationship between the activities of different HFT firms, and construct a measure of concurrent directional trading meant to capture correlated behavior. Anand and Venkataraman (2016) investigate the correlated activity of market makers on the Toronto Stock Exchange, some of which are HFT firms that operate as Electronic Liquidity Providers (ELPs). 3 Brogaard et al. (Forthcoming) identify short intervals with large price movements and show that NASDAQ HFT firms in the aggregate supply rather than demand liquidity during these intervals, hence possibly dampening volatility. 
Hasbrouck and Saar (2013) also find that aggregate HFT activity appears to lower the intraday volatility of NASDAQ stocks. 4 Alpha became a stock exchange on April 1, 2012. In July 2012, Alpha was acquired by the TMX Group, which also owns TSX. During our sample period, however, Alpha and TSX were independent trading venues. 5 MATCH Now also provides a real-time data feed to IIROC and is included in our data. Liquidnet Canada, another dark pool, executed only a few trades each day during our sample period. As a result, IIROC did not require it to participate in the real-time centralized data feed, instead requiring it to submit trade information manually at the end of the trading day. 6 Data on stock characteristics are obtained from the Summary Information Database of the Canadian Financial Markets Research Centre (CFMRC). 7 The NASDAQ HFT data set was constructed by NASDAQ with knowledge of the trading entities, although NASDAQ released only a general HFT flag rather than a separate flag for each HFT firm. 8 Furthermore, the quantitative variables are computed using data from all trading days in September 2010, while we carry out the empirical work on the 30 down, flat, and up days. This out-of-sample property is meant to ensure that the identification is exogenous to our empirical analysis, even though the quantitative variables are used simply for rough initial screening and our classification procedure utilizes the identities of the entities to manually select the HFT firms. 9 This program offers fee incentives to firms that use proprietary capital and high-frequency electronic trading algorithms to provide liquidity on the exchange. ELPs are either independent proprietary trading firms or proprietary trading desks within large banks or financial firms, and the program requires that they passively trade at least 65% of their volume. 
10 According to IIROC (2012), “one user ID may identify a single approved trader (for example a registered market maker or inventory trader), a business stream (for example, orders originated by a Participant’s online discount brokerage system), or a client who accesses the markets directly through a DMA relationship with a Participant.” User IDs are therefore utilized at the complete discretion of trading firms.
11 An IIROC (2014) report examined the feasibility of classifying user IDs using machine learning methods. The authors manually labeled 49 user IDs based on knowledge of the trading entities, which is considered the most accurate classification method, to train their algorithm. The trained algorithm was then used on all user IDs, and ended up classifying 98 user IDs as engaging in HFT. While it is possible that these 98 IDs are the most active ones among the 572 user IDs we identify, these results highlight the difficulty involved in classifying HFT firms using quantitative methods.
12 To ensure the robustness of our results, we also carry out a principal component analysis in terms of user IDs and discuss the results in the Online Appendix.
13 Even estimating net trading revenues requires assumptions about valuing inventory that can dramatically influence the results (see Carrion 2013). For completeness, the Online Appendix contains an analysis of net trading revenues (including fees and liquidity rebates) in the context of the underlying common strategies of HFT firms that we identify in this paper.
14 An AMEND order type is considered two messages, a cancellation and a resubmission, for the purpose of our measures. Our measures include both nondisplayed and displayed orders. A refresh of an iceberg order (when the displayed part is executed and shares from the nondisplayed part become displayed) does not lead to a change in price or quantity; hence only the initial order is counted. This standardizes the treatment for the various order types according to their economic functions and enables us to summarize all activity in terms of submissions and cancellations of nonmarketable orders and executions of marketable orders.
15 While the choice of activity representation to use for the PCA could matter, we find that the HFT firms we identify as following underlying common strategies using one representation are also identified as such using others. Section 5 further discusses this issue.
16 Our conclusions are not sensitive to the exact cutoff we use to denote an economically significant loading, and we provide a robustness test in the Online Appendix.
17 All measures are time-weighted. Since messages are time-stamped to 10 milliseconds, there are orders that are submitted and canceled (or executed) within the same time stamp. When we consider orders that stay in the book to provide liquidity (e.g., for our measures of depth), we assume that a submitted order stays in the book for 10 milliseconds. The exceptions are the following special order types: immediate-or-cancel orders (IOC), fill-or-kill orders (FOK), all-or-nothing orders (AON), dealer’s AG orders (that are generated to fulfill their market-making obligations and execute against an incoming order), odd lot orders (OL), and marketable orders, which are assumed to be executed or canceled upon arrival to the market and hence are not added to the limit-order book.
18 XIC tracks approximately 200 large-cap stocks traded on the TSX. The correlation in daily returns between the S&P/TSX60 and XIC is 0.99 for our 30-day sample period. We compute 1-second interval return and volatility (the absolute value of return) for XIC as we do for the sample stocks.
19 Note that the coefficients on spread (%Spread) and volatility (|Return|) are not statistically different from zero. When spreads are wider, market makers’ profit potential may be higher, and as a result they increase their activity. By supplying more liquidity, however, they work to narrow the spread. The contemporaneous relationship we estimate could incorporate both of these contrasting effects, resulting in a coefficient that is indistinguishable from zero. As for volatility, it is well known that market makers profit from transitory volatility (e.g., they buy when prices are pushed down too much and sell when they reverse). At the same time, market makers lose to informed traders (or simply faster traders) when the source of volatility is informational. Our volatility measure combines both transitory and permanent volatility, which could explain why we do not see a strong effect as these two would have opposite implications for the intensity of market makers’ activity.
20 We also carried out the analysis considering competition between all HFT firms in the market (not just in each product category). The insights we obtain are similar. In the Online Appendix, we provide tables with summary statistics for the similarity measures computed from MSG, TRD, and LMT, and the analysis using all HFT firms and multiple interval lengths.
21 If an HFT firm follows two underlying common strategies, we include its activity in the similarity measures of both product categories. This applies to HFT firms 28 and 31 (PC1 and PC2) as well as 17 and 20 (PC1 and PC3).
22 The similarity measures computed from MSG, TRD, or LMT have comparable magnitudes. For example, the time-series means of MktSimPC1, MktSimPC2, and MktSimPC3 are 0.320, 0.325, and 0.196, respectively, when we use MSG to represent HFT activity. These summary statistics for TRD (LMT) are 0.386, 0.342, and 0.151 (0.319, 0.324, and 0.196).
23 Identifying the HFT firms that follow the three underlying common strategies, though, is done based on the principal component analysis of MSG from Section 3.
24 The returns are computed from trade prices (using the trade closest to the end of an interval).
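To make the return and volatility construction in notes 18 and 24 concrete, the following is a minimal sketch and not the paper's actual implementation. It assumes that "the trade closest to the end of an interval" means the last trade before the interval ends, that an interval with no trade carries forward the previous price, and that returns are simple rather than logarithmic. The function names and the (time, price) input layout are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def interval_close_prices(
    trades: Sequence[Tuple[float, float]],  # (time in seconds, price), sorted by time
    start: float,
    n_intervals: int,
    width: float = 1.0,
) -> List[Optional[float]]:
    """Price of the last trade before the end of each interval.

    Assumption: an interval without trades carries forward the previous
    closing price; intervals before the first trade have no price (None).
    """
    closes: List[Optional[float]] = []
    last_price: Optional[float] = None
    i = 0
    for k in range(n_intervals):
        interval_end = start + (k + 1) * width
        # Consume every trade that occurs before this interval ends.
        while i < len(trades) and trades[i][0] < interval_end:
            last_price = trades[i][1]
            i += 1
        closes.append(last_price)
    return closes


def returns_and_volatility(
    closes: Sequence[Optional[float]],
) -> List[Tuple[Optional[float], Optional[float]]]:
    """Simple return per interval and its absolute value (the volatility proxy)."""
    out: List[Tuple[Optional[float], Optional[float]]] = []
    for prev, curr in zip(closes, closes[1:]):
        if prev is None or curr is None:
            out.append((None, None))  # no price available yet
        else:
            ret = curr / prev - 1.0
            out.append((ret, abs(ret)))
    return out
```

Calling `interval_close_prices(trades, start=0.0, n_intervals=4)` on a short list of time-stamped trades and passing the result to `returns_and_volatility` yields one (return, |return|) pair per pair of consecutive intervals. Other conventions, such as log returns or dropping empty intervals, would change the numbers but not the structure.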
25 The results are similar when we use the equal-weighted return rather than the value-weighted return as a proxy for the market portfolio.
26 The decrease in HFT activity in 2011 and in 2012 was attributed to a decline in volume and volatility in the market (see, e.g., Popper 2012; Schmerken 2013).
27 Sign tests and Wilcoxon tests reject the hypotheses that the mean and median coefficient on MktSimPC2 are equal to zero (with $$p$$-values < .001). We also reject the hypothesis that the coefficients on MktSimPC2 are equal to zero with a joint F-test from a seemingly unrelated regression estimation (SURE).
28 While an increase in MktSimPC2 from 0 to 0.5 is a rather large change, even a one-standard-deviation increase in MktSimPC2 would decrease short-horizon volatility between 4.3% (MSG) and 17.8% (TRD).
29 Out of the 52 stocks, 37 are cross-listed in the United States. We provide a discussion of robustness tests for cross-listed stocks as well as additional regression specifications in the Online Appendix.
30 We thank a referee for suggesting this analysis.
31 One cannot specify $$\tau > T$$ such that a long period is used to obtain the permanent or temporary price impacts for each trade and then average these price impacts for a short aggregation interval because most of the price adjustment would occur outside the aggregation interval.
32 In the empirical implementation we use relative price impact measures, dividing the CAD price impact by the prevailing midquote, to be consistent with the common usage of these measures in the literature.
33 The regressions presented in Table 5 use 60-second intervals for both the dependent and the independent variables.
34 The Best Price Obligation was replaced by the Order Protection Rule on February 1, 2011. The most significant change induced by the Order Protection Rule was that it relieved the broker-dealers of the obligation to ensure that when an order entered on a marketplace is executed, better-priced orders on other marketplaces are not ignored or traded-through. Instead, this obligation for regular orders is now placed upon the marketplace receiving the order. The Order Protection Rule therefore also mandates that marketable orders are directed to the trading venues that post the best prices. Out of the 30 trading days in our sample period, 29 days fall under the Best Price Obligation regime, and one day falls under the Order Protection Rule.
35 Menkveld (2013) notes that a new trading venue in Europe became viable only when a large HFT firm began trading on it.
36 The HHI is computed as the sum of squared market shares of the trading venues, and with five trading venues it is always between 0.2 (if volume is equally divided among the five trading venues) and 1 (if volume concentrates on one trading venue). When we use 1-second intervals to compute HHI for each stock, the average value for the sample is 0.879, and it declines to 0.778 and 0.629 when we use 10-second and 60-second intervals, respectively.
37 For example, if $$StockSimPC1_i$$ is computed using MSG, we add as a control variable the aggregate number of submissions, cancellations, and executions in stock $$i$$ that the eight HFT firms send to the market.
38 Unlike the market-wide similarity measures used in Section 4.1, the stock-specific similarity measures for the first two principal components are highly correlated (e.g., 0.71 using MSG). We therefore analyze each one separately rather than having all three measures in a single model.
39 We note that two or more trading venues could potentially be at the best bid or ask at the same time (or have the same narrowest spread). We also recognize that activity on one trading venue could affect activity on another trading venue like in the multimarket inventory model of Lescourret and Moinas (2015).
40 There are eight such HFT firms, four of which follow the underlying common strategy identified by the first principal component and four of which follow the underlying common strategy identified by the second principal component.
41 We present the results for MSG (total messages sent by the HFT firm) to economize on the size of the table. Results using TRD and LMT are similar in nature.
42 Since limit-order submissions and cancellations have opposing economic implications, we construct the directional activity measure such that we count the cancellation of a limit buy (sell) order by adding to the number of limit sell (buy) orders. For robustness, we also compute a directional activity measure such that submissions and cancellations of buy (sell) orders are both counted as buy (sell) orders, and the resultant correlations are very close. In particular, the result we present in Figure 2, which indicates that the similarity measures for total activity are much larger in magnitude than the similarity measures for directional activity, holds for both measures of directional activity.
43 We verified that the results of all our tests do not differ materially on down, flat, or up days. We also classified all 10-minute intervals into three categories based on volatility and three categories based on volume and examined whether similarity in HFT strategies differs when we focus on intraday periods distinguished by higher volume or volatility. We could not discern any clear patterns across the categories.
44 We thank a referee for suggesting this analysis.
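The concentration measure described in note 36 can be illustrated with a short sketch. It assumes that per-venue volume for a given stock and interval is already available as a mapping. The function name and the venue labels in the usage note are hypothetical and do not come from the paper.

```python
from typing import Mapping


def herfindahl_index(volume_by_venue: Mapping[str, float]) -> float:
    """Herfindahl-Hirschman Index: the sum of squared venue market shares.

    With N venues the value lies between 1/N (volume split equally)
    and 1 (all volume on a single venue).
    """
    total = sum(volume_by_venue.values())
    if total <= 0:
        raise ValueError("total volume must be positive")
    # Each venue's share of total volume, squared and summed.
    return sum((volume / total) ** 2 for volume in volume_by_venue.values())
```

With five hypothetical venues each holding equal volume, `herfindahl_index` returns approximately 0.2. With all volume on one venue it returns 1.0. These are the bounds stated in note 36. Shorter intervals tend to contain trades from fewer venues, which is consistent with the higher average HHI reported there for 1-second intervals.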
45 To compute net trading revenues (or what is left when shares are bought and sold by a firm), we incorporate trading venue fees and liquidity rebates and assume that shares left (short positions) at the end of the day are “liquidated” (closed) using either the end-of-day midquote or the closing price (Brogaard, Hendershott, and Riordan 2014). It is important to note that although we can calculate an estimate of net trading revenues, a direct analysis of the economic profits of HFT firms or trading desks cannot be conducted as their operating and other costs are unavailable.
46 For example, product market competition can make it more difficult to infer a manager’s action given a firm’s output (Golan, Parlour, and Rajan 2015), and in general could exacerbate the problem of incentivizing managers (e.g., Scharfstein 1988).

References
Ait-Sahalia, Y., and M. Sağlam. 2016. High frequency market making: Optimal quoting. Working Paper, Princeton University.
Aldridge, I. 2013. High-frequency trading: A practical guide to algorithmic strategies and trading systems, Second Edition. Hoboken, NJ: John Wiley & Sons.
Anand, A., and K. Venkataraman. 2016. Market conditions, fragility, and the economics of market making. Journal of Financial Economics 121:327–49.
Baron, M., J. Brogaard, and A. Kirilenko. 2012. The trading profits of high frequency traders. Working Paper, Princeton University.
Benos, E., J. Brugler, E. Hjalmarsson, and F. Zikes. 2017. Interactions among high-frequency traders. Journal of Financial and Quantitative Analysis 52:1375–402.
Benos, E., and S. Sagade. 2016. Price discovery and the cross-section of high-frequency trading. Journal of Financial Markets 30:54–77.
Besanko, D., D. Dranove, M. Shanley, and S. Schaefer. 2010. Economics of strategy, Fifth Edition. Hoboken, NJ: John Wiley & Sons.
Biais, B., T. Foucault, and S. Moinas. 2015. Equilibrium fast trading. Journal of Financial Economics 116:292–313.
Boehmer, E., K. Y. Fong, and J. J. Wu. 2015. International evidence on algorithmic trading. Working Paper, Singapore Management University.
Breckenfelder, J. H. 2013. Competition between high-frequency traders, and market quality. Working Paper, Stockholm School of Economics.
Brogaard, J., and C. Garriott. 2016. High-frequency trading competition. Working Paper, University of Washington and Bank of Canada.
Brogaard, J., T. Hendershott, and R. Riordan. 2014. High-frequency trading and price discovery. Review of Financial Studies 27:2267–306.
Brogaard, J., T. Hendershott, and R. Riordan. 2016. Price discovery without trading: Evidence from limit orders. Working Paper, University of Washington.
Brogaard, J., A. Carrion, T. Moyaert, R. Riordan, A. Shkilko, and K. Sokolov. Forthcoming. High-frequency trading and extreme price movements. Journal of Financial Economics.
Brown, N. C., K. D. Wei, and R. Wermers. 2014. Analyst recommendations, mutual fund herding, and overreaction in stock prices. Management Science 60:1–20.
Budish, E., P. Cramton, and J. Shim. 2015. The high-frequency trading arms race: Frequent batch auctions as a market design response. Quarterly Journal of Economics 130:1547–621.
Carrion, A. 2013. Very fast money: High-frequency trading on the NASDAQ. Journal of Financial Markets 16:680–711.
Chaboud, A. P., B. Chiquoine, E. Hjalmarsson, and C. Vega. 2014. Rise of the machines: Algorithmic trading in the foreign exchange market. Journal of Finance 69:2045–84.
Choi, N., and R. W. Sias. 2009. Institutional industry herding. Journal of Financial Economics 94:469–91.
Clark-Joseph, A. 2014. Exploratory trading. Working Paper, University of Illinois at Urbana-Champaign.
Comerton-Forde, C., K. Malinova, and A. Park. Forthcoming. Regulating dark trading: Order flow segmentation and market quality. Journal of Financial Economics.
Easley, D., M. M. López de Prado, and M. O’Hara. 2012. The volume clock: Insights into the high-frequency paradigm. Journal of Portfolio Management 39:19–29.
Foucault, T., J. Hombert, and I. Rošu. 2016. News trading and speed. Journal of Finance 71:335–82.
Foucault, T., R. Kozhan, and W. W. Tham. 2017. Toxic arbitrage. Review of Financial Studies 30:1053–94.
Golan, L., C. A. Parlour, and U. Rajan. 2015. Competition, managerial slack, and corporate governance. Review of Corporate Finance Studies 4:43–68.
Goldstein, M. A., P. Kumar, and F. C. Graves. 2014. Computerized and high-frequency trading. Financial Review 49:177–202.
Hagströmer, B., and L. Nordén. 2013. The diversity of high-frequency traders. Journal of Financial Markets 16:741–70.
Hagströmer, B., L. Nordén, and D. Zhang. 2014. How aggressive are high-frequency traders? Financial Review 49:395–419.
Han, J., M. Khapko, and A. S. Kyle. 2014. Liquidity with high-frequency market making. Working Paper, Swedish House of Finance.
Hasbrouck, J., and G. Saar. 2013. Low-latency trading. Journal of Financial Markets 16:646–79.
Hendershott, T., C. M. Jones, and A. J. Menkveld. 2011. Does algorithmic trading improve liquidity? Journal of Finance 66:1–33.
Hirschey, N. 2016. Do high-frequency traders anticipate buying and selling pressure? Working Paper, London Business School.
Ho, T. S., and H. R. Stoll. 1983. The dynamics of dealer markets under competition. Journal of Finance 38:1053–74.
Hoberg, G., and G. Phillips. 2016. Text-based network industries and endogenous product differentiation. Journal of Political Economy 124:1423–65.
Hoffmann, P. 2014. A dynamic limit order market with fast and slow traders. Journal of Financial Economics 113:156–69.
Hotelling, H. 1929. Stability in competition. Economic Journal 39:41–57.
IIROC. 2012. The HOT study: Phase I and II of IIROC’s study of high frequency trading activity on Canadian equity marketplaces. IIROC Report. http://www.iiroc.ca/Documents/2012/c03dbb44-9032-4c6b-946e-6f2bd6cf4e23_en.pdf
IIROC. 2014. Identifying trading groups: Methodology and results. IIROC Report. http://www.iiroc.ca/Documents/2014/d74eb0ab-e494-4c71-acf7-7569655ba5a7_en.pdf
Jarnecic, E., and M. Snape. 2014. The provision of liquidity by high-frequency participants. Financial Review 49:371–94.
Jarrow, R. A., and P. Protter. 2012. A dysfunctional role of high frequency trading in electronic markets. International Journal of Theoretical and Applied Finance 15:1–15.
Jones, C. M. 2013. What do we know about high-frequency trading? Working Paper, Columbia Business School.
Jovanovic, B., and A. J. Menkveld. 2016. Middlemen in limit order markets. Working Paper, New York University.
Khandani, A., and A. W. Lo. 2007. What happened to the quants in August 2007? Journal of Investment Management 5:29–78.
Khandani, A. E., and A. W. Lo. 2011. What happened to the quants in August 2007? Evidence from factors and transactions data. Journal of Financial Markets 14:1–46.
Kirilenko, A., A. S. Kyle, M. Samadi, and T. Tuzun. 2017. The Flash Crash: High-frequency trading in an electronic market. Journal of Finance 72:967–98.
Korajczyk, R. A., and D. Murphy. 2017. High frequency market making to large institutional trades. Working Paper, Northwestern University.
Lance, C. E. 1988. Residual centering, exploratory and confirmatory moderator analysis, and decomposition of effects in path models containing interactions. Applied Psychological Measurement 12:163–75.
Lescourret, L., and S. Moinas. 2015. Liquidity supply across multiple trading venues. Working Paper, ESSEC Business School.
Malinova, K., A. Park, and R. Riordan. 2013. Do retail traders suffer from high frequency traders? Working Paper, University of Toronto.
Menkveld, A. J. 2013. High frequency trading and the new market makers. Journal of Financial Markets 16:712–40.
Menkveld, A. J. 2016. The economics of high-frequency trading: Taking stock. Annual Review of Financial Economics 8:1–24.
Menkveld, A. J., and M. A. Zoican. 2017. Need for speed? Exchange latency and liquidity. Review of Financial Studies 30:1188–228.
O’Hara, M., G. Saar, and Z. Zhong. 2017. Relative tick size and the trading environment. Working Paper, Cornell University.
Pedersen, L. H. 2009. When everyone runs for the exit. International Journal of Central Banking 5:177–99.
Popper, N. 2012. High-speed trading no longer hurtling forward. New York Times, October 14.
Rosu, I. 2016. Fast and slow informed trading. Working Paper, HEC Paris.
Salop, S. C. 1979. Monopolistic competition with outside goods. Bell Journal of Economics 10:141–56.
Scharfstein, D. 1988. Product-market competition and managerial slack. RAND Journal of Economics 19:147–55.
Schmerken, I. 2013. High frequency trading loses its luster. Wall Street and Technology, April 1.
Securities and Exchange Commission (SEC). 2014. Concept Release on Equity Market Structure. Release No. 34-61358. https://www.sec.gov/rules/concept/2010/34-61358.pdf
Shy, O. 1995. Industrial organization: Theory and applications. Cambridge: MIT Press.
Subrahmanyam, A., and H. Zheng. 2016. Limit order placement by high-frequency traders. Borsa Istanbul Review 16:185–209.
Tirole, J. 1988. The theory of industrial organization. Cambridge: MIT Press.
Wermers, R. 1999. Mutual fund herding and the impact on stock prices. Journal of Finance 54:581–622.

© The Author(s) 2018. Published by Oxford University Press on behalf of The Society for Financial Studies. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/about_us/legal/notices)
The Review of Financial Studies, Oxford University Press
