Analyzing predictive performance of linear models on high-frequency currency exchange rates

Chanakya Serjam (c.serjam.z3@keio.jp) · Akito Sakurai (sakurai@ae.keio.ac.jp)
Graduate School of Science and Technology, Keio University, Yokohama 223-8522, Japan

Vietnam Journal of Computer Science (2018) 5:123–132

Abstract We generate a large number of predictive models by applying linear kernel SVR to historical bid data for three currency pairs obtained from high-frequency trading. The bid tick data are converted into equally spaced (1 min) data. Differences in price between successive previous timeframes are used as features to predict the direction of movement of the price in the next timeframe. Different values for the number of training samples, the number of features, and the length of the timeframes are used when learning the models. These models are used to conduct simulated currency trading in the year following the one in which the model was learned. Profits (sum of realized differences in best bid prices when an order is executed), hit ratios, and the number of trades executed using these models are recorded. The experiments indicate that while it is difficult to construct models using only historical data that consistently perform well, there are models that show good performance under certain pre-defined conditions, and that many of these models have an interesting property. Upon examining the parameters of these models, we discover that they have all-negative coefficients and a negligibly small intercept, while having positive profits and a good hit ratio. This suggests a simple yet effective trading strategy. Finally, we examine the historical data to find corroboration for the pattern suggested by the generated models and present the results.

Keywords Support vector regression (SVR) · Machine learning · Currency prediction · High-frequency limit order book

An earlier version of this research and paper was presented at the ACIIDS 2017 conference at Kanazawa, Japan, in April 2017. The authors are grateful to the organizers of the conference and all the participants and reviewers who provided valuable comments and feedback. This paper expands the training parameters used in the experiments to a much broader range, performs the experiments for another major currency pair (GB Pound/US Dollar), and investigates the historical data for the presence of the properties shown by the trained models.

1 Introduction

Global financial markets have undergone a technological revolution in the past couple of decades. This has been made possible by rapid advancements in various technical fields as well as major developments in the software and hardware in use. Many established exchanges have widely adopted electronic communication networks and trading systems [1–3]. As a result of the widespread acceptance and usage of the latest electronic systems in global financial markets, the processing time for tasks such as ordering or purchasing has gone down sharply compared to older traditional markets. Since lower processing time means lower overhead, the financial markets have a stake in pushing the processing time as low as possible. To achieve this, many financial marketplaces have been using high-frequency trading systems [4]. These systems keep human intervention (which is time-consuming and thus costly) to a bare minimum, and all the transactions are handled by computer algorithms to keep overhead such as time and cost as low as possible. High-frequency trading systems have been playing an increasingly vital role in trading (especially online trading).

One major form of trading is currency trading or foreign exchange (forex for short). The forex market is the largest, most liquid financial market in the world, dwarfing all other markets in size and volume of trading. However, it is also a very volatile market. According to a report from the Bank for International Settlements, the results from a recent survey [5] show that trading in foreign exchange markets averaged $5.1 trillion per day in April 2016. Although this is down from an average of $5.3 trillion per day in April 2013, it is still a very large market.

Traders investing in the currency markets are particularly interested in predicting the direction of movement of the price for the currency pair which they are looking to trade. If the price of the currency is about to go up, the trader will want to take the buy position, so he/she can sell the currency later at a higher price to turn a profit. If the price of the currency is about to go down, the trader will want to take the sell position. Later, the trader can buy the currency again for a lower price and turn a profit. Finally, the trader may assume a neutral position, i.e., neither buy nor sell. Therefore, the prediction task of a model trained for currency trading can have three outputs: buy, sell, or do nothing. The advent and widespread usage of high-frequency trading necessitates development and analysis of new trading strategies that can capture the short-term behavior of the markets. It is also very important to make an effort to understand the structure of the market under the influence of high-frequency trading.

In this paper, we conduct currency prediction experiments for the Euro/US Dollar, British Pound/US Dollar, and US Dollar/Japanese Yen currency pairs using support vector machine for regression (SVR) [6,7], and examine the results to better understand the structure of currency trading in the forex market. Based on the forecast of the models, we perform simulated trading and record the profits or losses by comparing the predicted price movement with the actual price movement. We also examine the coefficient and intercept values and correlate them to the profit/loss and hit ratio metrics. The simulated trading is performed under some assumptions and pre-defined conditions that may not be representative of the real world but of an ideal scenario. Finally, we examine the historical data for the presence of properties exhibited by the models. Some interesting results are presented.

This paper is divided into the following major sections. Section 2 describes the background (previous research) and method of research. Section 3 describes the experimental setup and discusses the process in detail. Section 4 presents the results of the experiments and is used for analysis and discussion of the results. Finally, Sect. 5 presents a conclusion to the research and this paper.

2 Background and method of research

2.1 Previous research

While currency rates are volatile and prone to fluctuations, they have also been shown to be deterministically chaotic [8,9]. While this may be due to a number of factors, it is generally believed that historical data capture this behavior most concretely and effectively. Accordingly, historical data usually become the primary input for any prediction model regardless of the technique used or the assumptions made.

A variety of techniques have been used for prediction tasks depending on the mathematical foundation or the value of specific model parameters. There has been considerable research [8–13] done on applying Artificial Neural Networks (ANNs) to forex forecasting. Deng et al. [14] and Deng and Sakurai [15] applied complex hybrid prediction techniques including Multiple Kernel Learning (MKL) and Genetic Algorithms (GA) to currency prediction and achieved good results. Kuo et al. [13] presented a decision support system for stock trading using GA-Based Fuzzy Neural Networks (GFNN) and ANNs. Another technique utilized for currency rates and financial timeseries prediction is Support Vector Machines [6,7], and it has also been applied successfully for high-frequency trading [16,17]. Studies [18,19] have shown that SVM-based models achieved on-par or better performance in forecasting of exchange rates or asset prices as compared to NN-based models for day trading.

While the techniques mentioned above show good results in prediction tasks, it is difficult to interpret the inner working of the models and how the prediction function generates the predictions. In addition, most of the techniques discussed above use dynamic training sets (using the sliding window technique) to incorporate the latest data/information when building a prediction model. We were interested to know whether a model trained on a static training set can be used for prediction tasks far beyond the time horizon for which it is supposed to be valid. We were therefore motivated to perform this research by a combination of factors: the good performance of SVM techniques in financial forecasting tasks, the suitability of linear models for understanding the prediction-making process, and the very little research available on using linear kernel SVR on a static training set of historical data (only previous price differences) in a high-frequency trading environment.

2.2 Method of research

The primary aim of our research was to establish whether a linear model trained only on a static training set of historical data can have good predictive performance, and if so, to analyze the models to find out about the structure of the market. In our goal of analyzing the financial models which take historical data as input and produce relatively good performance, we planned to focus on the characteristics and structure of the model being generated. Hence, we chose SVR with linear kernels as the technique for generating models, since it would be easier to analyze a linear model as the parameters would relate to real and observable data values. For further detailed reading and material on Support Vector Machines (SVM) and SVM for Regression (SVR), please refer to [6,7,16,17,20].

In high-frequency trading, the limit order book is updated every time there is a change in the bid or ask price or in case of other events such as a transaction being executed. These data are called tick data. The limit order book contains, among others, the timestamp (year/month/day and hour/minute/second), the best (highest) bid price, the bid volume, the best (lowest) ask price, and the ask volume. To study the timeseries properties of the price data, we only worked with the price data and eliminated the volume data. We also make use of only one price (bid price) rather than both prices, as there is not much qualitative difference in behavior between the two. We also subjected the tick data to some pre-processing, which included converting the tick data to equally spaced (1 min) data. Since the tick data are recorded every time there is a change in the order book, the data are unequally spaced and hence unsuitable for timeseries analysis. We wanted to check whether some patterns might emerge which can be learned by training models when the data are equally spaced. Converting the tick data to uniformly spaced data makes it easier to analyze as a timeseries.

In our experiments, we wanted to analyze whether there is a correlation between performance metrics such as profits or hit ratio and the initial parameters of the model such as the size of the training set, the number of features to be used for prediction, and the length of timeframe (1, 2, 3 min, etc). Therefore, we trained models for many different values of these parameters. The models were trained on historical data (see Sect. 3 for the training period) and then validated on the data from the following year by performing simulated trading. This is to establish the predictive value of the models, since validating the models on the same period in which they were learned would not have yielded any information about the predictive performance of the models on new unseen data. Various performance metrics are observed and used for comparative analysis. Then, we examined the coefficients and intercepts of the models generated to look for some basic learning rule or pattern in the models. Finally, we analyze the historical data to see if the pattern suggested by the trained models is valid or not, and why a large number of models exhibit the same property.

3 Experimental setup

The currency rates data used in our experiments were acquired from ICAP. The experiments were performed on three different sets of currency pairs: the Euro/US Dollar data set, the GB Pound/US Dollar data set, and the US Dollar/Japanese Yen data set. As previously mentioned, the original data sets contain the best bid and ask prices as well as the volumes. The data sets are pre-processed to remove the volume data as well as the ask price data. Then, the tick data are converted to equally spaced (1 min) data, taking the last tick in each minute. Therefore, we have data sets that contain the date and the last price at each minute.

The data sets used were from 2001 to 2015 and separated by year. Since the model is trained on a training set of the specified size extracted from 2 years (3 years in the case of GB Pound/US Dollar, since we need 3 years of minute data for GB Pound/US Dollar to construct the required training set), and then used for prediction on the next year, the prediction results analyzed cover 2003 to 2015. For example, the models that were trained on 2001 and 2002 (2001 to 2003 for GB Pound/US Dollar) were used for prediction in 2003 (2004 for GB Pound/US Dollar); the models trained on 2002 and 2003 were used for prediction in 2004, and so on.

3.1 Parameters for training the models

• Number of features: The values used for the number of features were 1, 2, 3, 4, 5, and 6. Features used in our model are the differences in price between successive periods of time going back n periods from the current time (t). For instance, if the number of features is 1, the model predicts the next output based on just one previous difference in price. Consequently, that model will have two parameters (since we are using linear kernel SVR), the coefficient and the intercept, and we extract those parameters to do a qualitative analysis of the model. If the number of features is n, the model predicts the next output based on n previous timeframes and, therefore, the model will have n + 1 parameters.
• Length of timeframes: The lengths of timeframes (in minutes) used were 1, 2, 3, 4, 5, 7, 10, 20, 30, 40, 50, 60, and 70. These values were used to see if there is any correlation between the length of the timeframes and the performance metrics such as profits or hit ratio obtained. Although this could be extended to larger timeframes, we believe that it might not be fully reflective of the structure of high-frequency trading, where trading is very fast and timeframes are inherently small. We also considered that, in timeframes greater than 1 min, there may be multiple starting points from which the training set can begin. Therefore, we generate models for all the possible starting points (in minutes) within a timeframe and also average them.
• Size of training set: The values used for the number of training samples are 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, and 10,000.
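The data preparation described in Sects. 3 and 3.1 can be illustrated with a minimal sketch. It is not the authors' code: the function names, the `(timestamp_seconds, bid)` tick layout, the treatment of starting points as a simple offset, and the choice of the next price difference as the regression target are all assumptions made for illustration. Handling of minutes without ticks is not described in the text and is left out.

```python
from typing import List, Tuple


def last_price_per_minute(ticks: List[Tuple[int, float]]) -> List[float]:
    """Collapse (timestamp_seconds, bid) ticks to the last bid in each minute.

    Assumption: ticks are sorted by time and every minute has a tick.
    """
    last = {}
    for ts, bid in ticks:
        last[ts // 60] = bid  # later ticks in the same minute overwrite earlier ones
    return [last[minute] for minute in sorted(last)]


def timeframe_prices(minute_prices: List[float], length: int, offset: int = 0) -> List[float]:
    """Sample every `length`-th minute price, starting at `offset` minutes.

    Assumption: one offset corresponds to one of the possible starting
    points within a timeframe mentioned in the text.
    """
    return minute_prices[offset::length]


def make_samples(prices: List[float], n_features: int):
    """Pair the n previous price differences with the next difference.

    Assumption: the sign of the target gives the direction to predict.
    """
    diffs = [b - a for a, b in zip(prices, prices[1:])]
    samples = []
    for i in range(n_features, len(diffs)):
        samples.append((diffs[i - n_features:i], diffs[i]))
    return samples
```

A training set of a given size would then be the first N samples returned by `make_samples`, which could be passed to any linear kernel SVR implementation.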
Models are generated for all possible combinations of these initial parameters.

Note on the use of bid data: the initial experiments, performed with both bid and ask price data for the sake of completeness, revealed that the results using either price are very similar.

Table 1 Summary statistics for the three currency pairs used in our experiments, 2001–2015

Currency pair   Avg. no. of price quote   Avg. no. of transactions   Avg. bid-ask spread
                updates per minute        (deals) per minute         for minute data
EURUSD          15.87                     10.68                      0.00021
GBPUSD          10.17                     0.49                       0.00087
USDJPY          13.90                     6.73                       0.02024

Table 2 Average ratios of positive changes vs. negative changes for all 13 timeframes used in the experiments

Currency pair   1 min   2 min   3 min   4 min   5 min   7 min   10 min
EURUSD          1.009   1.006   1.007   1.007   1.007   1.009   1.010
GBPUSD          1.020   1.013   1.011   1.009   1.008   1.007   1.006
USDJPY          1.016   1.014   1.013   1.013   1.014   1.016   1.017

Currency pair   20 min  30 min  40 min  50 min  60 min  70 min
EURUSD          1.011   1.013   1.011   1.012   1.011   1.011
GBPUSD          1.008   1.010   1.011   1.012   1.014   1.014
USDJPY          1.021   1.023   1.025   1.024   1.025   1.026

3.2 Performance metrics

• Hit ratio: The hit ratio, also known as directional symmetry, is a measure of how many times the model predicted the change correctly. In other words, if the model predicts upward movement and the actual data used for validation confirm it, then it counts as a hit.
• Profits: Profits are obtained as a result of simulated trading based on the predictions of our models and are simply the sum of the realized differences in best bid prices when the orders are executed. If the price at the closing of a timeframe t is price(t) and the prediction at the closing of the timeframe t is pred(t), then profit is given as follows:

Profit = [price(t + 1) − price(t)] × pred(t).    (1)

For the Euro/US Dollar and the GB Pound/US Dollar currency pairs, the profits were in US Dollars, and for the US Dollar/Japanese Yen currency pair, the profits were in Japanese Yen. It should be emphasized that the profits calculated in Eq. (1) are not representative of actual profits. In real-world trading, spread-crossing is an important and integral part of the profit calculation. Since we are working with only best bid prices, the spread does not factor into the equation. It is also important to point out that the bid-ask spread per trade is larger than the profits obtained per trade in most cases, and hence, profits calculated by Eq. (1) would not be positive if we did take the bid-ask spread into account.

For simulated trading, we put certain conditions in place. We assume that only 1 unit of the currency pair is being traded. This was done under the assumption that a small transaction of 1 unit will not alter the market price conditions substantially and thus the subsequent data will not be disrupted. No fee is charged for transactions. In the real world, there is a small fee charged for every transaction, but we have chosen to ignore that to focus solely on the timeseries properties of currency trading.

In the simulated trading, a trade is counted when we have a change in the predicted direction of movement of the currency. Since we are only trading 1 unit, if, for instance, the prediction of direction is downward movement more than once in a row, we do not execute or count the repeated trades.

The summary statistics of the data for our experiments are displayed in Tables 1 and 2. Table 1 provides the average no. of price quote updates per minute, the average no. of transactions (deals) per minute, and the average bid-ask spreads observed for 1-min data. Table 2 shows the ratio of positive changes in best bid price vs. negative changes in the best bid price. Since all the ratios in Table 2 are slightly larger than 1, the number of positive changes in best bid prices has been slightly higher than the number of negative changes for all timeframes aggregated from 2001 to 2015. The next section discusses the results of the experiments.

4 Results and analysis

The results of the experiments consisted of the profits per year, the hit ratios, and the no. of trades executed over the period of a year using those models, as well as the intercept and coefficients of the models. Since the models were grouped based on the number of features (1–6) used for the models, we calculated the average profits and hit ratios with respect to the length of timeframes and the size of training set (for each value of no. of features). This gave us four plots for each currency pair and gave insight into the performance of the models for different input parameters.

4.1 Performance metrics vs. length of timeframes

Figures 1, 2, and 3 show the performance metrics (avg. hit ratio and avg. profits per year) as a function of the length of timeframe for all different values of number of features (1–6) for the Euro/US Dollar pair, the GB Pound/US Dollar pair, and the US Dollar/Japanese Yen pair, respectively.

Fig. 1 Plots for avg. hit ratio (left) and avg. profit per year (right) vs. length of timeframe for the Euro/US Dollar currency pair

Fig. 2 Plots for avg. hit ratio (left) and avg. profit per year (right) vs. length of timeframe for the GB Pound/US Dollar currency pair

Fig. 3 Plots for avg. hit ratio (left) and avg. profit per year (right) vs. length of timeframe for the US Dollar/Japanese Yen currency pair

It is interesting to note that, as the length of timeframe increases, the avg. hit ratio increases too, irrespective of the no. of features, meaning an increase in the accuracy of trend prediction. However, at the same time, the profits from simulated trading go down as the length of timeframe increases. This is an interesting result, because normally profit would be expected to rise when hit ratio rises and vice versa. One reason for this might be that as the length of the timeframe increases, the no. of trades executed in our simulated trading decreases drastically. Thus, even if the hit ratio is higher, the number of trades executed might simply not be enough to generate profits comparable to shorter timeframes, which have a lower hit ratio but a large number of executed trades, and thus more average profit per year.

In addition, we can see that a smaller number of features results in a higher hit ratio but lower profits on average.

4.2 Performance metrics vs. training set size

Figures 4, 5, and 6 show the performance metrics as a function of the size of the training set for all different values of number of features for the Euro/US Dollar pair, the GB Pound/US Dollar pair, and the US Dollar/Japanese Yen pair, respectively.

Fig. 4 Plots for avg. hit ratio (left) and avg. profit per year (right) vs. training set size for the Euro/US Dollar currency pair

Fig. 5 Plots for avg. hit ratio (left) and avg. profit per year (right) vs. training set size for the GB Pound/US Dollar currency pair

Fig. 6 Plots for avg. hit ratio (left) and avg. profit per year (right) vs. training set size for the US Dollar/Japanese Yen currency pair

The plots show that there is an increase in both the hit ratio and the profits as the size of the training set increases. This might be because smaller training sets lead to over-fitting, whereas larger training sets can fine-tune the parameters a bit better. On average, a smaller number of features results in a higher hit ratio and higher profits; although, in US Dollar/Japanese Yen (Fig. 6), lower profits for a smaller number of features are observed.
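The performance metrics of Sect. 3.2 can be made concrete with a short sketch of the simulated trading loop. This is an illustration under stated assumptions rather than the authors' implementation: the function name `simulate` is hypothetical, predictions are encoded as +1 (up), −1 (down), or 0 (neutral), timeframes with a zero price change are skipped when computing the hit ratio, and a neutral prediction does not reset the current direction when counting trades.

```python
def simulate(prices, preds):
    """Apply Eq. (1) over a price series and collect the three metrics.

    prices[t] is the closing best bid of timeframe t and preds[t] is the
    predicted direction for the move from t to t + 1, so that
    len(preds) == len(prices) - 1.
    """
    profit = 0.0
    hits = 0
    calls = 0
    trades = 0
    current = 0  # direction of the last counted trade (0 = none yet)
    for t, pred in enumerate(preds):
        move = prices[t + 1] - prices[t]
        profit += move * pred  # Eq. (1), summed over timeframes
        if pred != 0 and move != 0:
            calls += 1
            hits += (move > 0) == (pred > 0)
        # Count a trade only when the predicted direction changes.
        if pred != 0 and pred != current:
            trades += 1
            current = pred
    hit_ratio = hits / calls if calls else 0.0
    return profit, hit_ratio, trades
```

For example, with prices `[10, 11, 13, 12, 12, 15]` and predictions `[1, 1, 1, -1, 1]`, the sketch returns a profit of 5.0, a hit ratio of 0.75 (three hits out of four non-zero moves), and three counted trades. No spread or fee is applied, matching the idealized conditions stated in Sect. 3.2.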
4.3 Analyzing trained model parameters

While taking a cursory glance at our results, we noticed that a large number of the models generated showed a similar relationship between the values of the intercept and the coefficients. These models had a negligibly small intercept (which would not influence the predictions) as well as negative coefficients (although the number of models like this decreased as the no. of features, and thus the no. of coefficients, increased) while still giving good hit ratios and profits. We counted the number of models that satisfied the condition of very small intercept, negative coefficients, and positive profit and hit ratio. The results for the Euro/US Dollar pair, the GB Pound/US Dollar pair, and the US Dollar/Japanese Yen pair are shown in Figs. 7, 8, and 9, respectively. The cases C1, C2, C3, and C4 are described as follows:

• Case 1 (C1): Absolute value of intercept < 0.1, all coefficients < −10 (< 0 for US Dollar/Japanese Yen), profits > 0, and hit ratio ≥ 60%.
• Case 2 (C2): Absolute value of intercept < 0.1, all coefficients < −10 (< 0 for US Dollar/Japanese Yen), profits > 0, and hit ratio ≥ 50% and < 60%.
• Case 3 (C3): Absolute value of intercept < 0.1, all coefficients < −10 (< 0 for US Dollar/Japanese Yen), profits > 0, and hit ratio < 50%.
• Case 4 (C4): Rest of the models (where not all coefficients are negative, or absolute value of intercept > 0.1, or profits < 0).

The difference between the coefficient thresholds used for the US Dollar/Japanese Yen pair and those used for the other currency pairs is due to the difference in tick size. US Dollar/Japanese Yen counts the smallest tick at the second decimal place. The other two currency pairs count the smallest tick at the fourth decimal place.

Fig. 7 Stacked bar plots representing the percentage of models for cases C1–C4 for different values of no. of features on x-axis (Euro/US Dollar currency pair)

Fig. 8 Stacked bar plots representing the percentage of models for cases C1–C4 for different values of no. of features on x-axis (GB Pound/US Dollar currency pair)

Fig. 9 Stacked bar plots representing the percentage of models for cases C1–C4 for different values of no. of features on x-axis (US Dollar/Japanese Yen currency pair)

The stacked bar plots confirmed our initial observation that a large number of models had negative coefficients and negligible intercept values while giving profit and good hit ratio. Thus, for models trained using linear SVR with a single feature, we can give a simple rule which states that the next prediction will be the opposite of the most recent (previous) movement direction. Concretely, if the previous trend is down, the model will predict up for the next change, and if the previous trend is up, the model will predict down for the next change. Using this simple trading rule, we get profit and a good hit ratio in our simulated trading when using a single previous movement in direction of the price. This property is called return reversal. From the bar plots (Figs. 7–9), we can see that out of all the models with just one feature, a large percentage of models fall into case 1 of having positive profits and good hit ratio with negligible intercept and negative coefficients. This includes models from all the different timeframes used when training the models. The positive profits and high hit ratio suggest that the strategy may be viable under certain pre-defined circumstances irrespective of the timeframe used.

For models with two or more features, while case 1 is still a significant percentage of the total models, it decreases as the number of features increases. Since two or more previous differences in price are being considered, it is possible that some of the features are negative, while others are positive. In this case, it is difficult to make a definitive statement about the presence of return reversal, as the all-negative coefficients no longer imply a single predicted direction. However, for n features, if all n features have the same sign, then the next price movement will be of the opposite sign with a much higher probability irrespective of timeframe, as this would satisfy the models in case 1.

In the next sub-section, we take a look at the percentages of return reversal when all n features have the same sign for models with two or more features. We do this for different timeframes to see if the condition is still satisfied.

4.4 Checking historical data for occurrence of return reversal

We check for return reversal using 1, 2, and 3 features over the sample timeframes of t = 1, 5, 20, and 60 min for all three currency pairs. We chose to check for return reversal at those timeframes because they provide a good spread over all the timeframes that we used to generate the models. In case the change in price at the next step is 0, we look for the nearest non-zero value in the future. Only bid data are used, since we also used bid data in training the models.

Table 3 Checking the return reversal property (in percentages rounded to two decimal places) for t = 1, 5, 20, and 60 min for Euro/US Dollar historical bid price data

Previous        1 min           5 min           20 min          60 min
moves           −1      +1      −1      +1      −1      +1      −1      +1
−1              46.97   53.03   47.38   52.62   47.00   53.00   47.03   52.97
+1              52.58   47.42   52.20   47.80   52.36   47.64   52.58   47.42
−1, −1          46.89   53.11   45.92   54.08   45.38   54.62   45.56   54.44
+1, +1          53.34   46.66   54.03   45.97   53.78   46.22   53.86   46.14
−1, −1, −1      46.18   53.82   45.00   55.00   44.07   55.93   43.95   56.05
+1, +1, +1      54.50   45.50   55.48   44.52   54.89   45.11   55.09   44.91

Table 4 Checking the return reversal property (in percentages rounded to two decimal places) for t = 1, 5, 20, and 60 min for GB Pound/US Dollar historical bid price data

Previous        1 min           5 min           20 min          60 min
moves           −1      +1      −1      +1      −1      +1      −1      +1
−1              45.90   54.10   46.69   53.31   46.67   53.33   46.51   53.49
+1              53.03   46.97   52.88   47.12   52.89   47.11   52.61   47.39
−1, −1          46.12   53.88   45.77   54.23   45.90   54.10   45.75   54.25
+1, +1          53.34   46.66   54.18   45.82   54.01   45.99   53.74   46.26
−1, −1, −1      45.67   54.33   44.96   55.04   45.16   54.84   45.20   54.80
+1, +1, +1      53.87   46.13   55.05   44.95   54.60   45.40   53.70   46.30

Table 5 Checking the return reversal property (in percentages rounded to two decimal places) for t = 1, 5, 20, and 60 min for US Dollar/Japanese Yen historical bid price data

Previous        1 min           5 min           20 min          60 min
moves           −1      +1      −1      +1      −1      +1      −1      +1
−1              46.13   53.87   47.10   52.90   46.91   53.09   46.76   53.24
+1              53.06   46.94   52.22   47.78   52.23   47.77   51.95   48.05
−1, −1          45.97   54.03   45.69   54.31   45.69   54.31   45.60   54.40
+1, +1          53.86   46.14   53.85   46.15   53.73   46.27   52.71   47.29
−1, −1, −1      45.12   54.88   44.37   55.63   44.60   55.40   44.39   55.61
+1, +1, +1      54.80   45.20   55.44   44.56   54.90   45.10   54.10   45.90

Tables 3, 4, and 5 show how often (in percentages) the next price change has a given sign, conditional on the preceding run of same-sign changes. The rows show the previous direction of movement of the price up to time t. −1s represent a negative change in price (the price goes down), whereas +1s represent a positive change in price (the price goes up). Accordingly, two or more consecutive −1s or +1s represent two or more such consecutive moves in the same direction. The columns show the probability of the following direction of movement of the price for time (t + 1). The results are very consistent for all three currency pairs and for all the timeframes checked.
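The tabulation behind Tables 3, 4, and 5 can be sketched as follows. This is only an illustration of the counting procedure, not the authors' code: the function names are hypothetical, and zero price changes are simply dropped from the whole sequence, which is a simplification of the rule of looking for the nearest non-zero change in the future (it also ignores zero changes inside the preceding run).

```python
from collections import Counter


def sign_sequence(prices):
    """Signs (+1 or -1) of successive non-zero price changes."""
    signs = []
    for a, b in zip(prices, prices[1:]):
        if b != a:  # simplification: zero changes are skipped entirely
            signs.append(1 if b > a else -1)
    return signs


def reversal_table(prices, run_length):
    """Percentage of next-move signs after `run_length` same-sign moves.

    Returns {previous_sign: {next_sign: percentage}}; a previous sign
    with no qualifying runs is omitted.
    """
    signs = sign_sequence(prices)
    counts = Counter()
    for i in range(run_length, len(signs)):
        run = signs[i - run_length:i]
        if all(s == run[0] for s in run):
            counts[(run[0], signs[i])] += 1
    table = {}
    for prev in (-1, 1):
        total = counts[(prev, -1)] + counts[(prev, 1)]
        if total:
            table[prev] = {
                nxt: 100.0 * counts[(prev, nxt)] / total for nxt in (-1, 1)
            }
    return table
```

Calling `reversal_table(prices, 1)`, `reversal_table(prices, 2)`, and `reversal_table(prices, 3)` on one timeframe's price series would correspond to the three pairs of rows in each table, with the entry for a previous sign and the opposite next sign being the return reversal percentage.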
One or more than one consecutive −1s is consistently followed by +1 with a higher percentage or probability in all timeframes. Similarly, one or more than one consecutive +1s is consistently followed by a −1 with a higher percentage in all timeframes. Thus, the probability of return reversal is always higher than that of the trend continuing irrespective of the timeframe or the length of the trend checked. This also helps to explain why a large number of models learned with linear kernel SVR, even for more number of features and for varied timeframes, showed properties of return reversal.

Footnote: This does not mean that the actual movement will be of opposite sign. However, the accuracy of predicting the movement of the sign is greater than 50%, and cannot be described as purely chance.

5 Conclusion and future work

In this paper, we conducted experiments to examine the performance of currency prediction models trained using linear kernel SVR on historical bid price data for high-frequency currency trading. We created models using various values for input parameters such as the length of training set, number of features, and length of timeframe for prediction. We also validated the results by performing simulated trading and recording the profits and hit ratio on next year’s data and got good results. On examining the models, we found a simple rule that gave good results for models with single features, which is to predict opposite of the previous direction. This property is also known as return reversal. For models with two or more features, consecutive previous movements in the same direction will result in a higher probability of the next movement being in the opposite direction. Finally, we validated these findings by examining the historical data for occurrence of return reversal, and showed that the probability of the price movement changing directions is above the chance level, and that the property of return reversal holds true irrespective of the timeframe being used.

For future work, we plan to study models with more complex features, including technical indicators, and hope to find a trading strategy that incorporates return reversal but has even better performance. We also plan to do further analysis to establish the statistical significance of results obtained in this experiment. Finally, we also hope to create models which give a better prediction of return reversal based on several other features such as technical indicators generated from the price data.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Cont, R., Stoikov, S., Talreja, R.: A stochastic model for order book dynamics. Oper. Res. 58(3), 549–563 (2010)
2. Parlour, C.A., Seppi, D.J.: Limit order market: a survey. Handb. Financ. Intermed. Bank. 5, 63–95 (2008)
3. Bank of International Settlements: Triennial Central Bank survey of foreign exchange and derivatives market activity in 2007. http://www.bis.org/publ/rpfxf07t.htm. Accessed 18 May 2018
4. Miller, R.S., Shorter, G.: High frequency trading: overview of recent developments (2016). https://fas.org/sgp/crs/misc/R44443.pdf. Accessed 18 May 2018
5. Bank of International Settlements: Triennial Central Bank survey of foreign exchange and OTC derivatives markets in 2016. http://www.bis.org/publ/rpfx16.htm. Accessed 18 May 2018
6. Cortes, C., Vapnik, V.: Support vector networks. Mach. Learn. 20(3), 273–297 (1995)
7. Smola, A., Vapnik, V., et al.: Support vector regression machines. Adv. Neural Inf. Process. Syst. 9, 155–161 (1996)
8. Hall, J.W.: Adaptive selection of U.S. stocks with neural nets. In: Deboeck, G.J. (ed.) Trading on the Edge: Neural, Genetic and Fuzzy Systems for Chaotic Financial Markets, pp. 45–65. Wiley, New York (1994)
9. Yao, J., Tan, C.L.: A case study on using neural networks to perform technical forecasting of forex. Neurocomputing 34, 79–98 (2000)
10. Zimmerman, H., Neuneier, R., Grothmann, R.: Multi-agent modeling of multiple FX-markets by neural networks. IEEE Trans. Neural Netw. 12(4), 735–743 (2001)
11. Zhang, G., Hu, M.Y.: Neural network forecasting of the British Pound/US Dollar exchange rate. OMEGA Int. J. Manag. Sci. 26(4), 495–506 (1998)
12. Ni, H., Yin, H.: Exchange rate prediction using hybrid neural networks and technical indicators. Neurocomputing 72(13–15), 2815–2823 (2009)
13. Kuo, R.J., Chen, C.H., et al.: An intelligent stock trading decision support system through integration of genetic algorithm based fuzzy neural network and artificial neural network. Fuzzy Sets Syst. 118(1), 21–45 (2001)
14. Deng, S., Sakurai, A., Yoshiyama, K., Mitsubuchi, T.: Hybrid method of multiple kernel learning and genetic algorithm for forecasting short-term foreign exchange rates. Comput. Econ. 45(1), 49–89 (2015)
15. Deng, S., Sakurai, A.: Integrated model of multiple kernel learning and differential evolution for EUR/USD trading. Sci. World J. 2014(914641), 12 (2014)
16. Fletcher, T., Shawe Taylor, J.: Multiple kernel learning with Fisher kernels for high frequency currency prediction. Comput. Econ. 42(2), 217–240 (2013)
17. Kercheval, A., Zhang, Y.: Modeling high-frequency limit order book dynamics with support vector machines. Quant. Financ. 15(8), 1315–1329 (2015)
18. Tay, F.E.H., Cao, L.: Application of support vector machines in financial time series forecasting. OMEGA Int. J. Manag. Sci. 29(4), 309–317 (2001)
19. Kim, K.: Financial time series forecasting using support vector machines. Neurocomputing 55(1–2), 307–319 (2003)
20. Burges, C.J.C.: A tutorial on support vector machines for pattern recognition. Data Min. Knowl. Discov. 2(2), 121–167 (1998)

Vietnam Journal of Computer Science (2018) 5:123–132, Springer Journals

Analyzing predictive performance of linear models on high-frequency currency exchange rates

Publisher
Springer Berlin Heidelberg
Copyright
Copyright © 2018 by The Author(s)
Subject
Computer Science; Information Systems and Communication Service; Artificial Intelligence (incl. Robotics); Computer Applications; e-Commerce/e-business; Computer Systems Organization and Communication Networks; Computational Intelligence
ISSN
2196-8888
eISSN
2196-8896
D.O.I.
10.1007/s40595-018-0108-x

Abstract

We generate a large number of predictive models by applying linear kernel SVR to historical currency rates’ bid data for three currency pairs obtained from high-frequency trading. The bid tick data are converted into equally spaced (1 min) data. Differences of price between the previous successive timeframes are used as features to predict the direction of movement of the price in the next timeframe. Different values for the number of training samples, number of features, and the length of the timeframes are used when learning the models. These models are used to conduct simulated currency trading in the year following the one in which the model was learned. Profits (sum of realized differences in best bid prices when order is executed), hit ratios, and number of trades executed using these models are recorded. The experiments indicate that while it is difficult to construct models using only historical data that consistently perform well, there are models that show good performance under certain pre-defined conditions, and that many of these models have an interesting property. Upon examining the parameters of these models, we discover that they have all negative coefficients and a negligibly small intercept, while having positive profits and good hit ratio. This suggests a simple yet effective trading strategy. Finally, we examine the historical data to find corroboration for the pattern suggested by the generated models and present the results. Keywords Support vector regression (SVR) · Machine learning · Currency prediction · High-frequency limit order book 1 Introduction electronic communication networks and trading systems [1–3]. As a result of the widespread acceptance and usage Global financial markets have undergone a technological rev- of the latest electronic systems in global financial markets, olution in the past couple of decades. 
This has been made the processing time for tasks such as ordering or purchasing possible by the rapid advancements in various technical fields has gone down exponentially as compared to older traditional as well as major developments in the software and hardware markets. Since lower processing time means lower overhead, in use. Many established exchanges have widely adopted the financial markets have a stake in pushing the process- ing time as low as possible. To achieve this, many financial marketplaces have been using high-frequency trading sys- An earlier version of this research and paper was presented at the ACI- IDS 2017 conference at Kanazawa, Japan, in April 2017. The authors tems [4]. These systems keep human intervention (which is are grateful to the organizers of the conference and all the participants time-consuming and thus costly) to a bare minimum, and all and reviewers who provided valuable comments and feedback. the transactions are handled by computer algorithms to keep overhead such as time and cost as low as possible. High- This paper expands the training parameters used in the experiments to a much broader range, performs the experiments for another major frequency trading systems have been playing an increasingly currency pair (GB Pound/US Dollar), and investigates the historical vital role in trading (especially online trading). One major data for the presence of the properties shown by the trained models. form of trading is currency trading or foreign exchange (forex for short). The forex market is certainly the largest, most liq- B Chanakya Serjam uid financial market in the world, dwarfing all other markets c.serjam.z3@keio.jp in size and volume of trading. However, it is also a very Akito Sakurai volatile market. 
As per a report from the Bank of Interna- sakurai@ae.keio.ac.jp tional Settlements, the results from a recent survey [5]show Graduate School of Science and Technology, Keio University, that trading in foreign exchange markets averaged $5.1 tril- Yokohama 223-8522, Japan 123 124 Vietnam Journal of Computer Science (2018) 5:123–132 lion per day in just a single month (April) of 2016. Although usually become the primary input for any prediction model this is down from an average of $5.3 trillion per day in April regardless of the technique used or the assumptions made. of 2013, it is still a very voluminous market. A variety of techniques have been used for prediction Traders investing in the currency markets are particularly tasks depending on the mathematical foundation or the value interested in predicting the direction of movement of the price of specific model parameters. There has been considerable for the currency pair which they are looking to trade. If the research [8–13] done on applying Artificial Neural Networks price of the currency is about to go up, the trader will want to (ANNs) to forex forecasting. Deng et al. [14] and Deng and take the buy position, so he/she can sell the currency later at Sakurai [15] applied complex hybrid prediction techniques a higher price to turn a profit. If the price of the currency is including Multiple Kernel Learning (MKL) and Genetic about to go down, the trader will want to take the sell position. Algorithms (GA) to currency prediction and achieved good Later, the trader can buy the currency again for a lower price results. Kuo et al. [13] presented a decision support system and turn a profit. Finally, the trader may assume a neutral for stock trading using GA-Based Fuzzy Neural Networks position, i.e., neither buy nor sell. Therefore, the prediction (GFNN) and ANNs. 
Another technique utilized for currency task of a model trained for currency trading can have three rates and financial timeseries prediction is Support Vector outputs: buy, sell, or do nothing. The advent and widespread Machines [6,7], and it has also been applied successfully for usage of high-frequency trading necessitates development high-frequency trading [16,17]. Studies [18,19]haveshown and analysis of new trading strategies that can capture the that SVM-based models achieved on-par or better perfor- short-term behavior of the markets. It is also very important mance in forecasting of exchange rates or asset prices as to make an effort to understand the structure of the market compared to NN-based models for day trading. under the influence of high-frequency trading. While the techniques mentioned above show good results In this paper, we conduct currency prediction experiments in prediction tasks, it is difficult to interpret the inner work- for Euro/US Dollar, British Pound/US Dollar, and US Dol- ing of the models and how the prediction function generates lar/Japanese Yen currency pairs using support vector machine the predictions. In addition, most of the techniques discussed for regression (SVR) [6,7], and examine the results to bet- above use dynamic training sets (using sliding window tech- ter understand the structure of currency trading in the forex nique) to incorporate the latest data/information for making market. Based on the forecast of the models, we perform sim- a prediction model. We were interested to know whether a ulated trading and record the profits or losses by comparing model trained on a static training set can be used for pre- the predicted price movement with the actual price move- diction tasks far beyond the time horizon for which it is ment. We also examine the coefficient and intercept values supposed to be valid. Therefore, due to combination of fac- and correlate them to the profit/loss and hit ratio metrics. 
The tors such as SVM techniques having good performance in simulated trading is performed under some assumptions and financial forecasting tasks, the feasibility of linear models defined pre-existing conditions that may not be representative for understanding the prediction making process, and very of the real world but of an ideal scenario. Finally, we examine little research available on using linear kernel SVR on static the historical data for the presence of properties exhibited by training set of historical data (only previous price differences) the models. Some interesting results are presented. in high-frequency trading environment, we were motivated This paper is divided into the following major sections. to perform this research. Section 2 describes the background (previous research) and method of research. Section 3 describes the experimental 2.2 Method of research setup and discusses the process in detail. Section 4 presents the results of the experiments and is used for analysis and The primary aim of our research was to try and establish discussion of the results. Finally, Sect. 5 presents a conclusion whether a linear model trained only on a static training set to the research and this paper. of historical data can have good predictive performance, and if so, to analyze the models to find out about the structure of the market. In our goal of analyzing the financial mod- 2 Background and method of research els which take historical data as input and produce relatively good performance, we planned to focus on the character- 2.1 Previous research istics and structure of the model being generated. Hence, we decided on SVR with linear kernels to be the choice of While currency rates are volatile and prone to fluctuations, technique for generating models, since it would be easier to they have also been shown to be deterministically chaotic analyze a linear model as the parameters would relate to real [8,9]. 
While this may be due to a number of factors, it is and observable data values. For further detailed reading and generally believed that historical data capture this behavior material on Support Vector Machines (SVM) and SVM for most concretely and effectively. Concurrently, historical data Regression (SVR), please refer to [6,7,16,17,20]. 123 Vietnam Journal of Computer Science (2018) 5:123–132 125 In high-frequency trading, the limit order book is updated The data sets are pre-processed to remove the volume data as every time there is a change in the bid or ask price or in case of well as the ask price data. Then, the tick data are converted other events such as a transaction being executed. These data to equally spaced (1 min) data which are the last tick data are called tick data. The limit order book contains, among in the minute. Therefore, we have data sets that contain the others, the timestamp (year/month/date and h/min/s), the best date and the last price at each minute. The data sets used were (highest) bid price, the bid volume, the best (lowest) ask from 2001 to 2015 and separated by year. Since the model is price, and the ask volume. To study the timeseries properties trained on the training set of the specified size extracted from of the price data, we only worked with the price data and 2 years (3 years in the case of GB Pound/US Dollar, since eliminated the volume data. We also make use of only 1 price we need 3 years of minute data for GB Pound/US Dollar (bid price) rather than both the prices as there is not much data to construct the required training set), and then used for qualitative difference in behaviors between both prices. We prediction on the next year, the data of results for prediction also subjected the tick data to some pre-processing which analysis are from 2003 to 2015. For example, the models that included converting the tick data to equally spaced (1 min) were trained in the year 2001 and 2002 (2001 to 2003 for GB data. 
Since the tick data are recorded every time there is Pound/US Dollar) were used for prediction in the year 2003 a change in the order book, the data are unequally spaced (2004 for GB Pound/US Dollar); the models trained in 2002 and hence unsuitable for timeseries analysis. We wanted to and 2003 were used for prediction in 2004, and so on. check whether some patterns might emerge which can be learned by training models when the data are equally spaced. 3.1 Parameters for training the models Converting the tick data to uniformly spaced data makes it easier to analyze as a timeseries. • Number of features: The values used for the number of In our experiments, we wanted to analyze whether there features were 1, 2, 3, 4, 5, and 6. Features used in our is a correlation between performance metrics such as profits model are the difference of price between successive peri- or hit ratio and the initial parameters of the model such as ods of time going back n periods from the current time size of the training set, the number of features to be used for (t). For instance, if the number of features is 1, it means prediction, and the length of timeframe (1, 2, 3 min, etc). that the model predicts the next output based on just one Therefore, we trained models for many different values of previous difference of price. Consequently, that model these parameters. The models were trained on 1 year, and will have two parameters (since we are using linear ker- then used for validation on the data from the next year by per- nel SVR), the coefficient and the intercept, and we extract forming simulated trading. This is to establish the predictive those parameters to do a qualitative analysis of the model. 
value of the models, since validating the models on the same If the number of features is n, the model predicts the next year they were learned would not have yielded any informa- output based on n previous time frames and, therefore, tion about the predictive performance of the models on new the model will have n + 1 parameters. unseen data. Various performance metrics are observed and • Length of timeframes: The lengths of timeframes (in min- used for comparative analysis. Then, we examined the coef- utes) used were 1, 2, 3, 4, 5, 7, 10, 20, 30, 40, 50, 60, and ficients and intercepts of the models generated to look for 70. These values were used to see if there is any cor- some basic learning rule or pattern in the models. Finally, we relation between the length of the timeframes and the analyze the historical data to see if the pattern suggested by performance metrics such as profits or hit ratio obtained. the trained models is valid or not, and why a large number of Although this could be extended to larger timeframes, models exhibit the same property. we believe that it might not be fully reflective of the structure of high-frequency trading, where trading is very fast and timeframes are inherently small. We also consid- 3 Experimental setup ered that, in timeframes greater than 1 min, there may be multiple starting points from which the training set can The currency rates data used in our experiments were begin. Therefore, we generate models for all the possible acquired from ICAP. The experiments were performed on starting points (in minutes) within a timeframe and also three different sets of currency pairs, the Euro/US Dollar data average them. set, the GB Pound/US Dollar and the US Dollar/Japanese • Size of training set: The values used for the number of Yen data set. As previously mentioned, the original data sets training samples are 2000, 3000, 4000, 5000, 6000, 7000, contain the best bid and ask prices as well as the volumes. 8000, 9000, and 10,000. 
The initial experiments performed with both bid and ask price data Models are generated for all possible combinations of for the sake of completeness revealed that the results using either price data are very similar. these initial parameters. 123 126 Vietnam Journal of Computer Science (2018) 5:123–132 Table 1 Summary statistics for Currency pairs Average no. of price Average no. of Average bid-ask spreads the three currency pairs used in quote updates per transactions (deals) per for minute data our experiments from the year minute minute 2001–2015 EURUSD 15.87 10.68 0.00021 GBPUSD 10.17 0.49 0.00087 USDJPY 13.90 6.73 0.02024 Table 2 Average ratios of Currencypairs 1min 2min 3min 4min 5min 7min 10 min positive changes vs negative changes for all 13 timeframes EURUSD 1.009 1.006 1.007 1.007 1.007 1.009 1.010 used in the experiments (table GBPUSD 1.020 1.013 1.011 1.009 1.008 1.007 1.006 continued below) USDJPY 1.016 1.014 1.013 1.013 1.014 1.016 1.017 Currency pairs 20 min 30 min 40 min 50 min 60 min 70 min EURUSD 1.011 1.013 1.011 1.012 1.011 1.011 GBPUSD 1.008 1.010 1.011 1.012 1.014 1.014 USDJPY 1.021 1.023 1.025 1.024 1.025 1.026 3.2 Performance metrics traded. This was done under the assumption that a small transaction of 1 unit will not change or alter the market prices • Hit ratio: The hit ratio, also known as directional symme- condition substantially and thus the following data set will try, is a measure of how many times the model predicted not be disrupted. No fee is charged for transactions. In the the change correctly. In other words, if the model predicts real world, there is a small fee charged for every transac- upward movement and the actual data used for validation tion, but we have chosen to ignore that to focus solely on the confirm it, then it counts as a hit. timeseries properties of currency trading. 
• Profits: Profits are obtained as a result of simulated trad- In the simulated trading, a trade is counted when we have ing based on the predictions of our models and are simply a change in the predicted direction of movement of the cur- the sum of the realized differences in best bid prices when rency. Since we are only trading 1 unit, if, for instance, the the orders are executed. If the price at the closing of a prediction of direction is downward movement more than timeframe t is price(t) and the prediction at the closing one times in a row, we do not execute or count those trades. of the timeframe t is pred(t), then profit is given as fol- The summary statistics of the data for our experiments lows: are displayed in Tables 1 and 2. Table 1 provides us metadata about average no. of price quote updates per minute, average no. of transactions (deals) per minute, and average bid-ask Profit = [price(t + 1) − price(t )]× pred(t ). (1) spreads observed for 1-min data. Table 2 shows the ratio of positive changes in best bid price vs. negative changes in For the Euro/US Dollar and the GB Pound/US Dollar the best bid price. Since all the ratios in Table 2 are slightly currency pair, the profits were in US Dollars, and for the larger than 1, it implies that the number of positive changes US Dollar/Japanese Yen currency pair, the profits were in in best bid prices has been slightly higher than the number Japanese Yen. It should be emphasized that the profits cal- of negative changes for all timeframes aggregated from 2001 culated in Eq. (1) are not representative of actual profits. to 2015. The next section discusses the results of the experi- In real-world trading, the concept of spread-crossing is an ments. important and integral part of the profit calculation. Since we are working with only best bid prices, the spread does not factor into the equation. 
It is also important to point out that the bid-ask spread per trade is larger than the profits 4 Results and analysis obtained per trade in most cases, and hence, profits calcu- lated by Eq. (1) would not be positive if we did take the The results of the experiments consisted of the profits per bid-ask spread into account. year, the hit ratios, and the no. of trades executed over the For simulated trading, we put certain conditions in place. period of a year using those models, as well as the inter- We assume that only 1 unit of the currency pair is being cept and coefficients of the models. Since the models were 123 Vietnam Journal of Computer Science (2018) 5:123–132 127 grouped based on the number of features (1–6) used for the (1–6) for the Euro/US Dollar pair, the GB Pound/US Dollar models, we calculated the average profits and hit ratios with pair, and the US Dollar/Japanese Yen pair respectively. respect to the length of timeframes and the size of training It is interesting to note that, as the length of timeframe set (for each value of no. of features). This gave us four plots increases, the avg. hit ratio increases too irrespective of the for each currency pair and gave insight into the performance no. of features, meaning an increase in the accuracy of trend of the models for different input parameters. prediction. However, at the same time, the profits from sim- ulated trading go down as the length of timeframe increases. This is an interesting result, because normally profit would 4.1 Performance metrics vs. length of timeframes be expected to rise when hit ratio rises and vice versa. One reason for this might be that as the length of the timeframe Figures 1, 2, and 3 below show the performance metrics (avg. increases, the no. of trades executed in our simulated trad- hit ratio and avg. profits per year) as a function of the length ing decreases drastically. 
Thus, even if the hit ratio is higher, of timeframe for all different values of number of features Fig. 1 Plots for avg. hit ratio (left) and avg. profit per year (right) vs. length of timeframe for the Euro/US Dollar currency pair Fig. 2 Plots for avg. hit ratio (left) and avg. profit per year (right) vs. length of timeframe for the GB Pound/US Dollar currency pair 123 128 Vietnam Journal of Computer Science (2018) 5:123–132 Fig. 3 Plots for avg. hit ratio (left) and avg. profit per year (right) vs. length of timeframe for the US Dollar/Japanese Yen currency pair Fig. 4 Plots for avg. hit ratio (left) and avg. profit per year (right) vs. training set size for the Euro/US Dollar currency pair the number of trades executed might simply not be enough The plots show that there is an increase in both the hit ratio to generate profits comparable to shorter timeframes, which and the profits as the size of the training set increases. This have lower hit ratio but a large number of executed trades, might be because smaller training sets lead to over-fitting, and thus more average profit per year. whereas larger training sets can fine tune the parameters a bit In addition, we can see that fewer number of features better. On average, fewer number of features results in higher results in higher hit ratio but lower profits on average. hit ratio and higher profits; although, in US Dollar/Japanese Yen (Fig. 6), lower profits for fewer number of features are observed. 4.2 Performance metrics vs. training set size Figures 4, 5, and 6 show the performance metrics as a function 4.3 Analyzing trained model parameters of the size of the training set for all different values of number of features for the Euro/US Dollar pair, the GB Pound/US While taking a cursory glance at our results, we noticed that Dollar pair, and the US Dollar/JP Yen pair, respectively. a large number of models generated had similarities in the 123 Vietnam Journal of Computer Science (2018) 5:123–132 129 Fig. 
5 Plots for avg. hit ratio (left) and avg. profit per year (right) vs. training set size for the GB Pound/US Dollar currency pair Fig. 6 Plots for avg. hit ratio (left) and avg. profit per year (right) vs. training set size for the US Dollar/Japanese Yen currency pair correlation between the values of the intercept and the coef- • Case 1 (C1): Absolute value of intercept < 0.1, all coeffi- ficients. These models had negligibly small intercept (which cients < −10 (< 0 for US Dollar/Japanese Yen), profits would not influence the predictions) as well as negative coef- > 0, and hit ratio ≥ 60%. ficients (although the number of models like this decreased as • Case 2 (C2): Absolute value of intercept < 0.1, all coef- the no. of features, and thus the no. of coefficients, increased) ficients < −10 (< 0 for US Dollar/Japanese Yen), profits while still giving good hit ratios and profits. We checked for > 0, and hit ratio ≥ 50% and < 60%. the number of models that satisfied the condition of very small intercept, negative coefficients, and positive profit and hit ratio. The results for the Euro/US Dollar pair, the GB The difference between the values of coefficients being checked for Pound/Japanese Yen pair, and the US Dollar/Japanese Yen the US Dollar/Japanese Yen pair as compared to the other currency pair are shown in Figs. 7, 8, and 9 respectively. The cases C1, pairs is due to the difference in tick rate. US Dollar/Japanese Yen count C2, C3, and C4 are described as follows: the smallest tick at the second decimal place. The other two currencies count the smallest tick at the fourth decimal place. 123 130 Vietnam Journal of Computer Science (2018) 5:123–132 Fig. 7 Stacked bar plots representing the percentage of models for cases Fig. 9 Stacked bar plots representing the percentage of models for C1–C4 for different values of no. of features on x-axis (Euro/US Dollar cases C1–C4 for different values of no. 
currency pair)
of features on x-axis (US Dollar/Japanese Yen currency pair)

Fig. 8 Stacked bar plots representing the percentage of models for cases C1–C4 for different values of no. of features on x-axis (GB Pound/US Dollar currency pair)

• Case 3 (C3): Absolute value of intercept < 0.1, all coefficients < −10 (< 0 for US Dollar/Japanese Yen), profits > 0, and hit ratio < 50%.
• Case 4 (C4): Rest of the models (where not all coefficients are negative, or absolute value of intercept > 0.1, or profits < 0).

The stacked bar plots confirmed our initial observation that a large number of models had negative coefficients and negligible intercept values while giving profit and a good hit ratio.

Thus, for models trained using linear SVR with a single feature, we can give a simple rule: the next prediction is the opposite of the most recent (previous) movement direction. Concretely, if the previous movement is down, the model predicts up for the next change, and if the previous movement is up, the model predicts down for the next change. Using this simple trading rule, which relies on only the single previous direction of price movement, we obtain profits and a good hit ratio in our simulated trading. This property is called return reversal. From the bar plots, we can see that, out of all the models with just one feature, a large percentage fall into case 1: positive profits and a good hit ratio with a negligible intercept and negative coefficients. This includes models from all the different timeframes used when training the models. The positive profits and high hit ratio suggest that the strategy may be viable under certain pre-defined circumstances irrespective of the timeframe used.

For models with two or more features, case 1 still accounts for a significant percentage of the total models, but that percentage decreases as the number of features increases. Since two or more previous price differences are being considered, it is possible that some of the features are negative while others are positive. In this case, it is difficult to make a definitive statement about the presence of return reversal, as the condition of all negative coefficients no longer determines the predicted direction by itself. However, for n features, if all n features have the same sign, then the next price movement will be of the opposite sign with a higher probability irrespective of timeframe, as this would satisfy the models in case 1.

In the next sub-section, we look at the percentages of return reversal when all n features have the same sign for models with two or more features. We do this for different timeframes to see if the condition is still satisfied.

4.4 Checking historical data for occurrence of return reversal

We check for return reversal using 1, 2, and 3 features over sample timeframes of t = 1, 5, 20, and 60 min for all three currency pairs. We chose these timeframes because they provide a good spread across all the timeframes that we used to generate the models. If the change in price at the next step is 0, we look for the nearest non-zero value in the future. Only bid data are used, since we also used bid data in training the models.

Tables 3, 4, and 5 show how often (in percentages) the next price movement takes each sign, given the preceding one, two, or three consecutive movements in the same direction. The rows show the previous direction of movement of the price up to time t. A −1 represents a negative change in price (the price goes down), whereas a +1 represents a positive change in price (the price goes up). Likewise, two or more consecutive −1s or +1s represent two or more such consecutive moves in the same direction. The columns show the probability of the following direction of movement of the price at time (t + 1). The results are very consistent for all three currency pairs and for all the timeframes checked.

Vietnam Journal of Computer Science (2018) 5:123–132

Table 3 Checking the return reversal property (in percentages rounded to two decimal places) for t = 1, 5, 20, and 60 min for Euro/US Dollar historical bid price data

| Previous movement(s) | 1 min, next −1 | 1 min, next +1 | 5 min, next −1 | 5 min, next +1 | 20 min, next −1 | 20 min, next +1 | 60 min, next −1 | 60 min, next +1 |
|---|---|---|---|---|---|---|---|---|
| −1 | 46.97 | 53.03 | 47.38 | 52.62 | 47.00 | 53.00 | 47.03 | 52.97 |
| +1 | 52.58 | 47.42 | 52.20 | 47.80 | 52.36 | 47.64 | 52.58 | 47.42 |
| −1, −1 | 46.89 | 53.11 | 45.92 | 54.08 | 45.38 | 54.62 | 45.56 | 54.44 |
| +1, +1 | 53.34 | 46.66 | 54.03 | 45.97 | 53.78 | 46.22 | 53.86 | 46.14 |
| −1, −1, −1 | 46.18 | 53.82 | 45.00 | 55.00 | 44.07 | 55.93 | 43.95 | 56.05 |
| +1, +1, +1 | 54.50 | 45.50 | 55.48 | 44.52 | 54.89 | 45.11 | 55.09 | 44.91 |

Table 4 Checking the return reversal property (in percentages rounded to two decimal places) for t = 1, 5, 20, and 60 min for GB Pound/US Dollar historical bid price data

| Previous movement(s) | 1 min, next −1 | 1 min, next +1 | 5 min, next −1 | 5 min, next +1 | 20 min, next −1 | 20 min, next +1 | 60 min, next −1 | 60 min, next +1 |
|---|---|---|---|---|---|---|---|---|
| −1 | 45.90 | 54.10 | 46.69 | 53.31 | 46.67 | 53.33 | 46.51 | 53.49 |
| +1 | 53.03 | 46.97 | 52.88 | 47.12 | 52.89 | 47.11 | 52.61 | 47.39 |
| −1, −1 | 46.12 | 53.88 | 45.77 | 54.23 | 45.90 | 54.10 | 45.75 | 54.25 |
| +1, +1 | 53.34 | 46.66 | 54.18 | 45.82 | 54.01 | 45.99 | 53.74 | 46.26 |
| −1, −1, −1 | 45.67 | 54.33 | 44.96 | 55.04 | 45.16 | 54.84 | 45.20 | 54.80 |
| +1, +1, +1 | 53.87 | 46.13 | 55.05 | 44.95 | 54.60 | 45.40 | 53.70 | 46.30 |

Table 5 Checking the return reversal property (in percentages rounded to two decimal places) for t = 1, 5, 20, and 60 min for US Dollar/Japanese Yen historical bid price data

| Previous movement(s) | 1 min, next −1 | 1 min, next +1 | 5 min, next −1 | 5 min, next +1 | 20 min, next −1 | 20 min, next +1 | 60 min, next −1 | 60 min, next +1 |
|---|---|---|---|---|---|---|---|---|
| −1 | 46.13 | 53.87 | 47.10 | 52.90 | 46.91 | 53.09 | 46.76 | 53.24 |
| +1 | 53.06 | 46.94 | 52.22 | 47.78 | 52.23 | 47.77 | 51.95 | 48.05 |
| −1, −1 | 45.97 | 54.03 | 45.69 | 54.31 | 45.69 | 54.31 | 45.60 | 54.40 |
| +1, +1 | 53.86 | 46.14 | 53.85 | 46.15 | 53.73 | 46.27 | 52.71 | 47.29 |
| −1, −1, −1 | 45.12 | 54.88 | 44.37 | 55.63 | 44.60 | 55.40 | 44.39 | 55.61 |
| +1, +1, +1 | 54.80 | 45.20 | 55.44 | 44.56 | 54.90 | 45.10 | 54.10 | 45.90 |
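As an illustration only (this is not the code used in our experiments), the counting procedure described in this sub-section might be sketched in Python as follows. The function name `reversal_table`, the `max_run` parameter, and the choice to skip any preceding run that contains a zero price change are assumptions made for this sketch. The input is assumed to be a list of equally spaced bid prices for one timeframe.

```python
from collections import Counter


def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def reversal_table(prices, max_run=3):
    """Percentage of next movements being -1 or +1, conditioned on the
    preceding run of 1..max_run movements that all share one sign.

    Returns {(run_sign, run_length): {-1: pct_down, 1: pct_up}}.
    """
    # Direction of each successive price change: -1, 0, or +1.
    moves = [sign(b - a) for a, b in zip(prices, prices[1:])]
    n = len(moves)

    # next_nonzero[i] is the first non-zero move at index >= i (0 if none),
    # mirroring "look for the nearest non-zero value in the future".
    next_nonzero = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        next_nonzero[i] = moves[i] if moves[i] != 0 else next_nonzero[i + 1]

    counts = {}
    for k in range(1, max_run + 1):
        for i in range(k, n):
            run = moves[i - k:i]
            s = run[0]
            # Assumption: runs containing a zero or mixed signs are skipped.
            if s == 0 or any(m != s for m in run):
                continue
            nxt = next_nonzero[i]
            if nxt == 0:  # no non-zero movement left in the data
                continue
            counts.setdefault((s, k), Counter())[nxt] += 1

    table = {}
    for key, c in counts.items():
        total = c[-1] + c[1]
        table[key] = {
            -1: round(100 * c[-1] / total, 2),
            1: round(100 * c[1] / total, 2),
        }
    return table
```

For example, `reversal_table(bid_prices_5min)[(-1, 2)]` would give the two percentages corresponding to the "−1, −1" row of the 5 min columns, where `bid_prices_5min` is a hypothetical list of 5 min bid prices.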
One or more consecutive −1s are consistently followed by a +1 with a higher percentage in all timeframes. Similarly, one or more consecutive +1s are consistently followed by a −1 with a higher percentage in all timeframes. Thus, the probability of return reversal is always higher than that of the trend continuing, irrespective of the timeframe or the length of the trend checked. This also helps to explain why a large number of models learned with linear kernel SVR, even with more features and for varied timeframes, showed properties of return reversal.

Footnote: This does not mean that the actual movement will be of the opposite sign. However, the accuracy of predicting the sign of the movement is greater than 50%, and cannot be described as purely chance.

5 Conclusion and future work

In this paper, we conducted experiments to examine the performance of currency prediction models trained using linear kernel SVR on historical bid price data for high-frequency currency trading. We created models using various values for input parameters such as the length of the training set, the number of features, and the length of the timeframe for prediction. We also validated the results by performing simulated trading on the following year's data and recording the profits and hit ratio, and obtained good results. On examining the models, we found a simple rule that gave good results for models with a single feature, which is to predict the opposite of the previous direction. This property is also known as return reversal. For models with two or more features, consecutive previous movements in the same direction result in a higher probability of the next movement being in the opposite direction. Finally, we validated these findings by examining the historical data for occurrence of return reversal, and showed that the probability of the price movement changing direction is above the chance level, and that the property of return reversal holds true irrespective of the timeframe being used.

For future work, we plan to study models with more complex features, including technical indicators, and hope to find a trading strategy that incorporates return reversal but has even better performance. We also plan to do further analysis to establish the statistical significance of the results obtained in this experiment. Finally, we also hope to create models which give a better prediction of return reversal based on several other features such as technical indicators generated from the price data.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Cont, R., Stoikov, S., Talreja, R.: A stochastic model for order book dynamics. Oper. Res. 58(3), 549–563 (2010)
2. Parlour, C.A., Seppi, D.J.: Limit order market: a survey. Handb. Financ. Intermed. Bank. 5, 63–95 (2008)
3. Bank of International Settlements: Triennial Central Bank survey of foreign exchange and derivatives market activity in 2007. http://www.bis.org/publ/rpfxf07t.htm. Accessed 18 May 2018
4. Miller, R.S., Shorter, G.: High frequency trading: overview of recent developments (2016). https://fas.org/sgp/crs/misc/R44443.pdf. Accessed 18 May 2018
5. Bank of International Settlements: Triennial Central Bank survey of foreign exchange and OTC derivatives markets in 2016. http://www.bis.org/publ/rpfx16.htm. Accessed 18 May 2018
6. Cortes, C., Vapnik, V.: Support vector networks. Mach. Learn. 20(3), 273–297 (1995)
7. Smola, A., Vapnik, V., et al.: Support vector regression machines. Adv. Neural Inf. Process. Syst. 9, 155–161 (1996)
8. Hall, J.W.: Adaptive selection of U.S. stocks with neural nets. In: Deboeck, G.J. (ed.) Trading on the Edge: Neural, Genetic and Fuzzy Systems for Chaotic Financial Markets, pp. 45–65. Wiley, New York (1994)
9. Yao, J., Tan, C.L.: A case study on using neural networks to perform technical forecasting of forex. Neurocomputing 34, 79–98 (2000)
10. Zimmerman, H., Neuneier, R., Grothmann, R.: Multi-agent modeling of multiple FX-markets by neural networks. IEEE Trans. Neural Netw. 12(4), 735–743 (2001)
11. Zhang, G., Hu, M.Y.: Neural network forecasting of the British Pound/US Dollar exchange rate. OMEGA Int. J. Manag. Sci. 26(4), 495–506 (1998)
12. Ni, H., Yin, H.: Exchange rate prediction using hybrid neural networks and technical indicators. Neurocomputing 72(13–15), 2815–2823 (2009)
13. Kuo, R.J., Chen, C.H., et al.: An intelligent stock trading decision support system through integration of genetic algorithm based fuzzy neural network and artificial neural network. Fuzzy Sets Syst. 118(1), 21–45 (2001)
14. Deng, S., Sakurai, A., Yoshiyama, K., Mitsubuchi, T.: Hybrid method of multiple kernel learning and genetic algorithm for forecasting short-term foreign exchange rates. Comput. Econ. 45(1), 49–89 (2015)
15. Deng, S., Sakurai, A.: Integrated model of multiple kernel learning and differential evolution for EUR/USD trading. Sci. World J. 2014(914641), 12 (2014)
16. Fletcher, T., Shawe Taylor, J.: Multiple kernel learning with Fisher kernels for high frequency currency prediction. Comput. Econ. 42(2), 217–240 (2013)
17. Kercheval, A., Zhang, Y.: Modeling high-frequency limit order book dynamics with support vector machines. Quant. Financ. 15(8), 1315–1329 (2015)
18. Tay, F.E.H., Cao, L.: Application of support vector machines in financial time series forecasting. OMEGA Int. J. Manag. Sci. 29(4), 309–317 (2001)
19. Kim, K.: Financial time series forecasting using support vector machines. Neurocomputing 55(1–2), 307–319 (2003)
20. Burges, C.J.C.: A tutorial on support vector machines for pattern recognition. Data Min. Knowl. Discov. 2(2), 121–167 (1998)

Journal

Vietnam Journal of Computer Science, Springer Journals

Published: May 26, 2018
