TY - JOUR
AU - Kumar, K
AB - Abstract Stock market prices are dynamic in nature, which makes it challenging for sellers and buyers to predict which stocks will trend in the future. To predict the stock market effectively, the chronological penguin Levenberg–Marquardt-based nonlinear autoregressive network (CPLM-based NARX) is employed, and the prediction is based on the past and recent status of the market. Initially, the input data are subjected to feature extraction based on technical indicators, namely WILLR, ROCR, MOM, RSI, CCI, ADX, TRIX, MACD, OBV, TSF, ATR and MFI. The technical indicators are adapted for predicting the stock market. Wrapper-enabled feature selection is then employed to select the highly significant features among those generated by the technical indicators. The highly significant features are fed to the prediction module, which is developed using the NARX model. The NARX model is trained with the CPLM algorithm, which integrates the chronological-based penguin search optimization algorithm with the Levenberg–Marquardt algorithm. Prediction using the proposed CPLM-based NARX shows superior performance in terms of mean absolute percentage error and root mean square error, with values of 0.96 and 0.805, respectively.
1. INTRODUCTION
Predicting future stock prices is a complicated task that depends on several internal and external factors. Although many researchers have studied stock market price prediction, forecasts still deviate widely from the original data [1]. Generating a high yield in the stock market relies on effective prediction of future financial asset prices [2, 3]. A stock market index is considered an imaginary portfolio and is a common measure for analyzing the performance of each sector.
On the other hand, a market trading scheme is effectual if and only if precise prediction is assured in that specific market [4–6]. Stock market prediction is a great scientific challenge, with respect to both the methodology and the theory of prediction [7]. Accurate prediction of stock market multimedia (chart) data [8] is not possible using existing ideas and methods. The efficient market hypothesis states that all current information is already reflected in stock prices and that prices move only when new information arrives. Because new information arrives at random, stock prices cannot be predicted accurately from historical values [9–11], which makes stock market prediction a tedious process. According to the efficient market hypothesis, past and current information is accurately reflected in stock prices. Thus, changes in price are based purely on new information or ‘news’ and do not depend on existing information [12]. The information related to stock markets is dynamic and not known in advance, so the price follows a random process in settling on the best price. If this hypothesis holds, stock market prediction is useless [10]. Above all, a large number of factors cannot be captured in the time series yet have a huge impact on it. Evidence from the stock market includes the fact that changes in stock prices are decided by investors. Investors’ actions may appear unreasonable, yet they are essentially understandable and rational given the social organization, social structure, collective beliefs and perceptions of this complex arena [14].
Prediction aims to forecast changes in stock prices using time-series data [15, 16]. A predictive method forecasts the values of data, whereas a descriptive method demonstrates the relationships among the data. Data mining extracts previously unknown and useful information from the available dataset. The types of data mining techniques include classification and regression. Commonly used techniques for predicting the future market include the genetic algorithm (GA), artificial neural network (ANN) techniques [17], particle swarm optimization and so on [1]. Until now, prediction methods have been categorized into four types: regression methods, neural network-based methods [18, 19], expert-based methods and time-series methods. These methods require a huge amount of past data and, in addition, rely on distributions and statistics that reflect the characteristics of the system [13]. This makes prediction hard to carry out and, because of the expense, often impractical [20]. The most common approach to prediction in financial markets is time-series analysis, in which predictions and decisions are made on the basis of historical records of stock prices. One tool employed for time-series-based stock market prediction is the ANN, which tends to reveal hidden and unknown patterns in the records [1]. ANNs are most commonly applied in areas of finance that comprise corporate bond rating, bankruptcy prediction and stock market prediction. Moreover, ANNs are effective at learning the patterns of financial data.
Once the data are sent to the ANN, the major step is converting the data from their native numeric format to the numeric range that an ANN can handle effectively. This transformation matters for the learning process because it aims to enhance the generalization ability of the learned results. Another major machine learning technique is support vector regression (SVR) [21]; the SVR algorithm has been employed to predict stock market prices [22, 23]. Optimization methods [24] have also been used for predicting stock market prices. The primary intention of this paper is to develop a stock market prediction system based on time-series prediction using the NARX model. Time-series prediction enables effective prediction of the stock in a particular area or locality based on the stock index data of the previous term, month or year. The proposed NARX model serves as an adaptive prediction model whose inputs are the stock data of the previous period and whose weights are computed by the proposed algorithm. The proposed algorithm combines the chronological-based Penguins Search Optimization Algorithm (PeSOA) and the Levenberg–Marquardt (LM) algorithm to predict the future market from the past and current status of the market. The proposed method incorporates the merits of both algorithms to improve accuracy and computational time. The proposed model predicts the stock market in three stages: feature extraction, feature selection and prediction. Initially, the stock data are subjected to feature extraction based on technical indicators. The technical indicators are utilized for stock market prediction.
The highly significant features of the data are fed to the prediction module, for which the NARX model is employed. The NARX model uses the CPLM algorithm, which integrates the chronological-based PeSOA and the LM algorithm. Prediction is affected by global trends, local trends and noise, so additional detrending methods are required to clean the data for further applications [25]. One of the major problems associated with technical techniques is that they are ‘self-destructing’: once a profitable trading strategy is understood and traders copy it by choosing the same buy or sell actions, the opportunity disappears. Because of the ‘regime-shifting’ [10] character of the market, a flourishing trading strategy must be self-adaptive and dynamic [13]. Causal influence may proliferate in the causal structure, which is common in huge data, and the true causal relations may be buried under a large number of spurious causalities. In addition, multiple-cause structures present a great challenge for existing causal discovery methods. Existing additive noise models for discrete data work only on pairs of variables and do not consider complex causal structure. Multiple-cause structure here refers to local causal structures in which a variable has more than one cause [26]. Simple voting and stacking ensemble methods failed because of the high correlation between the predictions of the classifiers. The drawback of generative topographic maps is that the number of categories available for discretization was restricted, and the number of categories changes with the nature of the individual features [27]. Deep learning algorithms yielded poor performance on higher-dimensional data and large window sizes [10]. These challenges motivate this paper, which aims to overcome them and predict stock market prices accurately.
The major contributions of the research are the following:
Proposed CPLM-based NARX neural network (CPLM-NARX): the proposed CPLM algorithm, which combines the LM algorithm with the chronological-based PeSOA optimization algorithm, is applied to train the NARX neural network.
Wrapper-based feature selection: a wrapper-based feature selection method is used to select the highly significant features generated by the technical indicators for predicting the stock market.
The paper is organized as follows: Section 1 introduces stock market prediction, and Section 2 reviews the existing methods of stock market prediction along with their challenges. The proposed method of stock market prediction is presented in Section 3, the results of the existing and proposed methods are evaluated in Section 4 and the conclusion is given in Section 5.
2. LITERATURE REVIEW
This section reviews stock market prediction techniques along with their disadvantages and illustrates the challenges of the existing methods. Chong et al. [28] developed a three-stage stock market prediction system using fuzzy clustering and a neural network. In the first stage, multiple regression analysis identified the economic and financial variables that hold a strong relation with the output. In the second stage, a differential evolution-based type-2 fuzzy clustering technique was used to obtain better positions for the cluster centers, and in the third stage a fuzzy type-2 neural network performed the reasoning for future stock price prediction. The differential evolution-based algorithm overcame the problems of the standard iterative scheme in dealing with distance functions.
Klausner [13] investigated the predictability of the Dow Jones Industrial Average time series. Predictability was assessed using the Hurst exponent, which guided the selection of data for forecasting. Prediction models built on time series with large Hurst exponents saved time and produced superior results. Chen et al. [26] examined stock market movements using Granger causality tests, co-integration and nonlinear approaches with mutual information and correlations. Because non-stationarities and trends affected the underlying datasets, adaptive multifractal detrended cross-correlation analysis (AMF-DXA) and adaptive multifractal detrended fluctuation analysis (AMF-DFA) were employed. Useful information was extracted from the mutual interactions, and the adaptive algorithm removed the local trends in the dataset. Wang et al. [29] carried out a systematic analysis of deep learning networks for stock market prediction. The importance of the deep learning algorithm was that features were extracted from large raw data without depending on the predictors’ prior knowledge. Unsupervised feature extraction methods, such as the autoencoder, principal component analysis and the restricted Boltzmann machine, were assessed on the input data for their effect on the predictive ability of the network. The deep neural networks were able to extract additional information from the residuals of the autoregressive model and thereby enhanced stock market prediction performance. Born and Acherqui [10] introduced a stock prediction method using deep learning, and the evaluation was carried out using Google stock price multimedia data (chart) from NASDAQ. The method aimed at improving the accuracy of the stock market forecast. Ouahilal et al.
[23] presented the extreme learning machine (ELM), which predicted stock market prices using two data sources. The trading signal mining platform made use of market news articles and stock tick prices, and the prediction accuracy and prediction speed were enhanced using the basic ELM and kernel ELM. The comparisons concluded that the kernelized ELM and the radial basis function support vector machine attained greater prediction accuracy and faster prediction speed than the back-propagation neural network and the basic version of ELM. Pehlivanli et al. [30] developed a prediction model using a mutual information-based sentimental analysis methodology along with ELM to improve prediction performance. A second model combined mutual information-based sentimental analysis with the kernel-based ELM (MISA-K-ELM) and struck a balance between prediction accuracy and prediction speed. MISA-K-ELM is a kernel-based learning scheme that can be extended to multiple kernel-based learning schemes, which makes it applicable to prediction with multiple integrated data sources. Enke et al. [27] developed a multiple-cause discovery method combined with structure learning (McDSL) that removed false causalities. The method proceeded in two phases. The first phase used a conditional independence test to distinguish direct causal candidates from indirect ones. The second phase computed the causal direction of the multi-cause structure using a hybrid causal discovery method. McDSL reliably discovered the multi-cause structures and removed indirect causes, and its performance was found to be better. Gocken et al.
[31] developed a method that predicted the next-day price trend in the stock market well by removing redundant and irrelevant indicators from the dataset using a binary classification model. However, it did not support a multi-class classification model. Malagrino et al. [32] developed a stock market forecasting model based on a heuristic optimization methodology, or GA, and ANNs. However, this method did not consider parameters such as the type of transfer function and the number of hidden layers, which affect the ANN architecture. David et al. [33] developed a method that used a Bayesian network to verify which stock market indices exert influence around the world. However, the verification could not be carried out when the data came from a non-crisis period. Zhanga et al. [34] developed a model based on long short-term memory networks to predict future trends in stock market prices from the price history and technical analysis indicators. However, the financial results were not accurate compared with the baselines. Gheraibia et al. [35] developed a model using an unsupervised heuristic algorithm to predict stock price movement and its interval of growth rate within predefined prediction durations. However, the pattern recognition of the unsupervised heuristic algorithm was not applicable to all shapes.
3. PROPOSED CPLM ALGORITHM FOR PREDICTING THE STOCK MARKET
This section describes the proposed CPLM algorithm for stock market prediction. In this method, the data to be predicted are subjected to feature extraction, feature selection and prediction. Initially, the data are fed to the feature extraction block, which extracts features using the technical indicators. The technical indicators are WILLR, ROCR, MOM, RSI, CCI, ADX, TRIX, MACD, OBV, TSF, ATR and MFI, which highlight the various features of the data.
Indicators such as WILLR, RSI, CCI and MACD specify the buy-and-sell signals, whereas ATR, OBV and TRIX indicate the volatility signal, volume weights, and noise elimination and data smoothing, respectively. The extracted features are subjected to wrapper-enabled feature selection, which selects the essential features that represent the significance of the data. Finally, the prediction is performed using the NARX neural network. The NARX is trained using the proposed CPLM algorithm, which is formed by integrating the LM algorithm and the chronological concept with the penguin optimization algorithm [35]. The proposed prediction model is depicted in Fig. 1. The steps involved in the prediction are feature extraction using the technical indicators, feature selection using the wrapper and prediction using the NARX model.
FIGURE 1. Block diagram of the proposed method of stock market prediction.
1. Feature extraction using the technical indicators: the technical indicators play a major role in extracting the features that enable effective prediction. This method employs 12 technical indicators for feature extraction.
a) WILLR: WILLR, also known as Williams %R, determines where the current day’s closing price falls within the range of the past 10 days’ transactions. It is calculated as the ratio of the difference between the highest price and the closing price to the difference between the highest price and the lowest price.
b) ROCR: ROCR refers to the rate of change, and it computes the change in price relative to the price a given number of trading intervals earlier.
c) MOM: MOM indicates the momentum, and it measures the change in price from the previous intervals.
d) RSI: RSI refers to the relative strength index, and it signals overbought and oversold market conditions.
e) CCI: CCI indicates the commodity channel index, which determines cyclical turns in the stock price.
f) ADX: ADX is the average directional index, which determines current trends.
g) TRIX: TRIX refers to the triple exponential moving average, which smooths insignificant movements.
h) MACD: MACD refers to the moving average convergence divergence, which employs different exponential moving averages (EMAs) to signal buy and sell.
i) OBV: OBV refers to the on-balance volume, which relates the change in trading volume to the change in price.
j) TSF: TSF refers to the time-series forecast, which computes the linear regression of the 20-day price.
k) ATR: ATR indicates the average true range, which reflects the volatility of the stock market.
l) MFI: MFI refers to the money flow index, which relates the typical price to volume.
These 12 technical indicators can extract features such as price change (ROCR and MOM), buy-and-sell signals (WILLR, RSI, CCI and MACD), stock trend discovery (ADX and MFI), volatility signals (ATR), noise elimination and data smoothing (TRIX) and volume weights (OBV) [36]. All these features are organized to form the feature vector of the data.
2. Feature selection using the wrapper algorithm: the features are selected using wrapper feature selection based on the C4.5 decision tree algorithm. The C4.5 method reduces the data size without distortion and improves the performance of feature selection. The wrapper method selects the features without having to consider a large amount of data [37].
3. Prediction of the stock market using the NARX model: the selected features are fed to the NARX model, which adaptively predicts the present state of the stock market.
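Returning to step 1, three of the listed indicators (WILLR, ROCR and MOM) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the function names, the `period` parameter and the use of `None` for undefined leading values are hypothetical, and WILLR is returned as the plain ratio described in the text, whereas the conventional Williams %R scales that ratio by -100.

```python
from typing import List, Optional, Sequence

def willr(high: Sequence[float], low: Sequence[float],
          close: Sequence[float], period: int = 10) -> List[Optional[float]]:
    """Ratio (highest - close) / (highest - lowest) over a rolling window."""
    out: List[Optional[float]] = [None] * len(close)
    for t in range(period - 1, len(close)):
        highest = max(high[t - period + 1:t + 1])
        lowest = min(low[t - period + 1:t + 1])
        # Assumption: a flat window (highest == lowest) maps to 0.0.
        out[t] = (highest - close[t]) / (highest - lowest) if highest != lowest else 0.0
    return out

def rocr(close: Sequence[float], period: int = 1) -> List[Optional[float]]:
    """Rate of change as a ratio: current price over the price `period` steps earlier."""
    return [None if t < period else close[t] / close[t - period]
            for t in range(len(close))]

def mom(close: Sequence[float], period: int = 1) -> List[Optional[float]]:
    """Momentum: current price minus the price `period` steps earlier."""
    return [None if t < period else close[t] - close[t - period]
            for t in range(len(close))]
```

Stacking the outputs of such functions column by column, and dropping the leading rows that contain `None`, would give one possible form of the feature vector mentioned in step 1.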
The main advantages of the NARX model are that it handles time-series applications effectively and that it is a nonlinear regressor that estimates the output at each iteration from the past predictions or the past status of the stock market. The weights of the NARX model are optimally tuned by the proposed CPLM algorithm. The proposed CPLM algorithm uses the chronological-based PeSOA and the LM algorithm, and the weight update is based on the error estimate. The chronological concept considers past values to predict the future. In the PeSOA and LM algorithms, the new weights are computed from only the most recent weight values, without taking deeper past records into account. The chronological concept incorporates the past values by deriving a new equation for the weights of the neural network. The proposed CPLM algorithm is employed to update the weights of the NARX. The result is the predicted present status of the stock market.
3.1. Proposed stock market prediction using CPLM-based NARX neural network
The NARX neural network [38] for stock market prediction is described in this section. The NARX neural network is a type of recurrent neural network used in nonlinear time-series prediction and has considerable advantages over other classical prediction models. Its main benefit is that it learns effectively and can produce the optimal solution in a limited time. The NARX neural network is able to generalize and is more beneficial than other classifiers. NARX comprises a multilayer feed-forward network, a recurrent loop and time delays, and it plays an important role in prediction using time-series data. Time-series data are past data records that change over time. Here, the NARX neural network contains input, hidden and output layers.
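The combination of time delays, a recurrent loop and a feed-forward network can be sketched as follows. This is a hedged illustration only: the helper names `narx_regressor` and `narx_forward`, the choice of two delays, the `tanh` hidden activation and the linear output are assumptions made for the example and are not taken from the paper.

```python
import math
from typing import List, Sequence

def narx_regressor(exog: Sequence[Sequence[float]],
                   past_outputs: Sequence[float],
                   q: int, delays: int = 2) -> List[float]:
    """Build the input vector for time step q from delayed values.

    Each exogenous series contributes its values at q-1, ..., q-delays,
    followed by the fed-back outputs at the same delays (the recurrent loop).
    """
    vec: List[float] = []
    for series in exog:
        vec.extend(series[q - d] for d in range(1, delays + 1))
    vec.extend(past_outputs[q - d] for d in range(1, delays + 1))
    return vec

def narx_forward(x: Sequence[float],
                 w_hidden: Sequence[Sequence[float]], b_hidden: Sequence[float],
                 w_out: Sequence[float], b_out: float) -> float:
    """One hidden layer (tanh, assumed) followed by a linear output (assumed)."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out
```

During training the fed-back values would typically be the known targets, while during multi-step forecasting they would be the network's own earlier predictions; which of the two the paper uses is not stated in this section.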
The input layer holds the exogenous input vector, the regressed output vector and the delayed exogenous input vector. The congestion in the NARX network is based on five features, which are considered as the inputs to the NARX neural network. Let the inputs of the NARX be indexed by |$m$| and represented as $$\begin{equation} {s}_m=\left\{{s}^o,{s}^r,{s}^b,{s}^c,{s}^l\right\} \end{equation}$$(1) where |${s}^o$| indicates the rate of old share, |${s}^r$| represents the network service rate, |${s}^b$| refers to the network bandwidth, |${s}^c$| specifies the network congestion level and |${s}^l$| denotes the queue length. The NARX neural network is used to produce the optimized output and is formulated as $$\begin{equation} \overline{Z}= fun\left[\begin{array}{l}{s}^o\left(q-1\right),\ {s}^o\left(q-2\right),\ {s}^r\left(q-1\right),\ {s}^r\left(q-2\right),\\ {s}^b\left(q-1\right),\ {s}^b\left(q-2\right),\ {s}^c\left(q-1\right),\ {s}^c\left(q-2\right),\\ {s}^l\left(q-1\right),\ {s}^l\left(q-2\right),\ {Z}^t\left(q-1\right),\ {Z}^t\left(q-2\right)\end{array}\right] \end{equation}$$(2) where |$(q-1)$| and |$(q-2)$| represent the two tapped delays and |${Z}^t$| represents the optimized output. The network transfer function is formulated as $$\begin{equation} fun(\cdot)=A\left[\sum{y}_{F,G}\ F\left(\circ \right) \right]+{S}_0 \end{equation}$$(3) where |${y}_{F,G}$| denotes the weights between the hidden layer and the output layer, |$F(\circ )$| represents the activation function of the hidden layer, |${S}_0$| refers to the bias of the output layer and |$A$| is the activation function of the output layer. $$\begin{equation} F\left(\circ \right)= hfun\left(\sum{y}_{m,F}\cdot{s}_m\right)+{S}_1 \end{equation}$$(4) where |$hfun$| specifies the squashing function of the hidden layer (termed the hyperbolic tangent function here, although the bounded form used is logistic) and is expressed as $$\begin{equation} hfun=\frac{1}{1+\exp \left(\sum{y}_{m,F}\cdot{s}_m\right)};\ \ 1> hfun>0. 
\end{equation}$$(5) Differentiating the above equation with respect to the weight and the bias gives $$\begin{equation} \frac{\partial hfun}{\partial{y}_{G,F}}=-\frac{\exp \left[ fun\left(\cdot \right)\right]}{{\left[1+\exp \left[ fun\left(\cdot \right)\right]\right]}^{2}}\cdot{s}_m \end{equation}$$(6) $$\begin{equation} \frac{\partial hfun}{\partial{S}_{G,F}}=-\frac{\exp \left[ fun\left(\cdot \right)\right]}{{\left[1+\exp \left[ fun\left(\cdot \right)\right]\right]}^{2}} \end{equation}$$(7) Figure 2 illustrates the architecture of the NARX neural network. The network has two layers, the hidden layer and the output layer. It uses two tapped delays, and the feedback flows in a single direction.
FIGURE 2. Architecture of the NARX model.
Here, stock prediction is performed using the NARX neural network, which is tuned by the proposed algorithm. The proposed algorithm selects the optimal weights based on the minimum error, which is computed between the predicted output and the ground truth. First, each constituent algorithm computes a set of weights, tunes the NARX with them and generates its output individually. The error of each output is then computed against the ground truth. Finally, the weights of the algorithm that yields the minimum error are adopted to update the NARX. Let the weights computed using PeSOA and the chronological concept be given as $$\begin{equation} y=\left\{{y}_1^{CP},{y}_2^{CP},\dots, {y}_x^{CP}\right\}. \end{equation}$$(8) The weights computed using the LM algorithm are represented as $$\begin{equation} y=\left\{{y}_1^L,{y}_2^L,\dots, {y}_x^L\right\}. 
\end{equation}$$(9) Finally, the weights computed by the hybrid model of the chronological-based PeSOA and LM algorithm are represented as $$\begin{equation} y=\left\{{y}_1^{CPLM},{y}_2^{CPLM},\dots, {y}_x^{CPLM}\right\}. \end{equation}$$(10) The optimum weights selected by the proposed algorithm are used to tune the NARX model so that it adaptively predicts the stock price.
3.2. Proposed CPLM algorithm for optimal weight selection
The main aim of the proposed CPLM-NARX neural network is to attain the optimal weights using the proposed CPLM algorithm. The CPLM is designed by incorporating the chronological concept in PeSOA [35] and then combining the result with the LM algorithm [39]; it is applied to the NARX neural network to determine the optimal weights for adaptive prediction. The LM algorithm converges fast and is stable. The stock market price is dynamic in nature, and the accuracy of the stock prices predicted by the PeSOA and LM algorithms is high. However, LM is known to be a non-robust algorithm: in nonlinear regression, its convergence to an optimal solution depends on the starting point. The PeSOA is a meta-heuristic algorithm devised on the basis of the collaborative hunting strategy of penguins. The PeSOA is considered to be more robust and efficient. It tends to detect all local minima and the global minimum in the search space even if the number of penguins is large. The chronological concept is incorporated in PeSOA to analyze past data and events and to update the solution based on the events that occurred over time. Hence, combining the chronological concept, PeSOA and the LM algorithm makes the proposed CPLM robust.
The steps involved in the CPLM algorithm are as follows:
Step 1: initialization: the weights of the NARX neural network are randomly initialized and represented as $$\begin{equation} y=\left\{{y}_1,{y}_2,\cdots, {y}_k\right\}. \end{equation}$$(11) Let the learning rate be denoted as |${\alpha}_j=0.1$| and the delay rate be represented as |${q}_j=0.1$|.
Step 2: computation of weights using the two algorithms: the weight of the NARX model is updated as $$\begin{equation} {y}_{j+1}=\left\{\begin{array}{ll}{y}_{j+1}^L; & \textrm{if}\ {(Err)}^L<{(Err)}^{CP}\\ {y}_{j+1}^{CP}; & \textrm{if}\ {(Err)}^{CP}<{(Err)}^L\end{array}\right. \end{equation}$$(12) where |${y}_{j+1}$| represents the new weight of the neural network (NN), |${y}_{j+1}^L$| indicates the weight update using the LM algorithm and |${y}_{j+1}^{CP}$| denotes the weight update using the chronological-based PeSOA algorithm. The mean absolute percentage error (MAPE) values are computed for the outputs generated by the LM-based neural network and the chronological-based PeSOA and are represented as |${(Err)}_{j+1}^L$| and |${(Err)}_{j+1}^{CP}$|, respectively.
Step 3: computation of weights using the LM algorithm: the weights are updated using the LM algorithm, which generates the optimal weights for training the neural network. The weight update of the LM algorithm is given as $$\begin{equation} {y}_{j+1}^L={y}_j-\Delta{y}_j \end{equation}$$(13) where |$\Delta{y}_j$| refers to the incremental weight and |${y}_j$| indicates the current weight of the neural network. The incremental weight is calculated using equation (14) $$\begin{equation} \Delta{y}_j={\left[Q+{\alpha}_jI\right]}^{-1}\ast R \end{equation}$$(14) where |$I$| indicates the identity matrix, |$R$| represents the gradient matrix with |$R={V}^T\ast Err$|, |$Q={V}^TV$| represents the Hessian matrix and |${\alpha}_j$| denotes the learning rate.
Hence, the weight update using the LM algorithm is represented as $$\begin{equation} {y}_{j+1}^L={y}_j-{\left[Q+{\alpha}_jI\right]}^{-1}\ast R \end{equation}$$(15) $$\begin{equation} {y}_{j+1}^L={y}_j-{\left[{V}^TV+{\alpha}_jI\right]}^{-1}\ast{V}^T\ {(Err)}_j \end{equation}$$(16) where |$V$| represents the Jacobian matrix and |${(Err)}_j$| represents the error of the current iteration.
i) Computation of output using LM: the output is computed from the scalar product of the input features and the weights. The network output is 1 if the scalar product reaches the threshold of the activation function, and is represented as $$\begin{equation} {O}^L(\,f)=\left\{\begin{array}{ll}1 & \textrm{if}\ {y}^Lh\ge \theta \\ 0 & \textrm{otherwise}\end{array}\right. \end{equation}$$(17) where |${O}^L(\,f)$| denotes the network output, |${y}^Lh$| denotes the scalar product of the input features and the weights and |$\theta$| denotes the threshold value. Hence, the output of the neural network is represented as |${Y}_i^L= nfun\big({y}_j^L,\ h\big)$|. Here, |$nfun\big({y}_j^L,\ h\big)$| represents the network output function using the weights of the LM algorithm, |${y}^L$|.
ii) Computation of errors: once the current output of the neural network, |${Y}_i^L= nfun\big({y}_j^L,\ h\big)$|, is determined from the weights generated by the LM algorithm, the error of the output is calculated. The error is determined using the mean square error of the network and is formulated as $$\begin{equation} {(Err)}_j^L=\frac{1}{Z} \sum \limits_{g=1}^Z{\left({Y}_g^L-{Y}_f^T\right)}^2 \end{equation}$$(18) where |${Y}_g^L$| represents the output of the |${g}^{th}$| neuron in the output layer and |${Y}_f^T$| indicates the targeted output of the neural network.
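The LM update of equations (13)–(16) and the mean square error of equation (18) can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions rather than the authors' code: the helper names `solve`, `lm_step` and `mse` are hypothetical, the Jacobian |$V$| is passed in as a list of rows (one per sample), the error vector is taken as prediction minus target, and a small Gaussian elimination replaces an explicit matrix inverse.

```python
from typing import List, Sequence

def solve(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def lm_step(weights: Sequence[float], jacobian: Sequence[Sequence[float]],
            err: Sequence[float], alpha: float) -> List[float]:
    """One update y_{j+1} = y_j - (V^T V + alpha I)^{-1} V^T Err, as in eq. (16)."""
    n = len(weights)
    q = [[sum(row[i] * row[j] for row in jacobian) for j in range(n)]
         for i in range(n)]                       # Q = V^T V
    r = [sum(row[i] * e for row, e in zip(jacobian, err))
         for i in range(n)]                       # R = V^T Err
    for i in range(n):
        q[i][i] += alpha                          # Q + alpha I
    delta = solve(q, r)                           # incremental weights, eq. (14)
    return [w - d for w, d in zip(weights, delta)]

def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean square error between outputs and targets, in the spirit of eq. (18)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
```

For a model that is linear in its weights, a single step with `alpha = 0` reduces to the ordinary least-squares solution, while a larger `alpha` shortens the step toward a gradient-descent-like move; this is the usual reading of the damping term, stated here as background rather than as a claim from the paper.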
Step 4: computation of weights using chronological-based PeSOA: the weight update of the NARX neural network using the PeSOA is expressed as $$\begin{equation} {y}_{j+1}^{CP}={y}^J+\textrm{rand}()\left|{y}_j-{y}_j^{CP}\right| \end{equation}$$(19) where |${y}_j$| indicates the best weight, |${y}_j^{CP}$| specifies the weight of the current iteration, |${y}^J$| indicates the old weight, rand() denotes a random number and |${y}_{j+1}^{CP}$| is the new weight obtained using the chronological-based PeSOA. The chronological concept is incorporated into the algorithm to generate effective solutions. The update equation of PeSOA is modified based on the chronological concept so that the update depends not only on present events but also on past events; thus, the equation of the chronological-based PeSOA is given by $$\begin{equation} {y}_j^{CP}={y}^J+\textrm{rand}()\left|{y}_j-{y}_{j-1}^{CP}\right|. \end{equation}$$(20) As |$\left|{y}_j-{y}_{j-1}^{CP}\right|$| is an absolute value, the equation is written as $$\begin{equation} {y}_j^{CP}={y}^J+\textrm{rand}()\left({y}_j-{y}_{j-1}^{CP}\right). \end{equation}$$(21) Substituting equation (21) in (19), the obtained equation is formulated as $$\begin{equation} {y}_{j+1}^{CP}={y}^J+\textrm{rand}()\left({y}_j-\left({y}^J+\textrm{rand}()\left({y}_j-{y}_{j-1}^{CP}\right)\right)\right). \end{equation}$$(22) After rearranging the above equation, the update equation is represented as $$\begin{equation} {y}_{j+1}^{CP}={y}^J\left(1-\textrm{rand}()\right)+\textrm{rand}(){y}_j-{\left[\textrm{rand}()\right]}^2\left[{y}_j-{y}_{j-1}^{CP}\right]. \end{equation}$$(23) The above equation can also be written as depicted in equation (24) $$\begin{equation} {y}_{j+1}^{CP}={y}^J\left(1-\textrm{rand}()\right)+\textrm{rand}(){y}_j\left[1-\textrm{rand}()\right]+{\left(\textrm{rand}()\right)}^2{y}_{j-1}^{CP}. \end{equation}$$(24) The weight update based on the chronological concept is used for analyzing the previous records. The solution for the |${\big(j+1\big)}^{th}$| iteration based on the chronological concept is given by $$\begin{equation} {y}_{j+1}^{CP}=\frac{y_j^{CP}+{y}_{j+1}^{CP}}{2}. \end{equation}$$(25) Thus, the update solution of the proposed CPLM after substituting equations (21) and (24) in equation (25) is given by $$\begin{align} {y}_{j+1}^{CP}&=\frac{1}{2}\left[{y}^J+\textrm{rand}()\left({y}_j-{y}_{j-1}^{CP}\right)+{y}^J\left(1-\textrm{rand}()\right) \right. \nonumber \\ &\left. \quad+{y}_j\textrm{rand}()\left(1-\textrm{rand}()\right)+{\left[\textrm{rand}()\right]}^2{y}_{j-1}^{CP}\right] \end{align}$$(26) $$\begin{align} {y}_{j+1}^{CP}&=\frac{1}{2}\left[{y}^J+\textrm{rand}(){y}_j-\textrm{rand}(){y}_{j-1}^{CP}+{y}^J-\textrm{rand}(){y}^J \right.\nonumber\\ &\left. \quad+\textrm{rand}(){y}_j-{\left[\textrm{rand}()\right]}^2{y}_j+{\left[\textrm{rand}()\right]}^2{y}_{j-1}^{CP}\right] \end{align}$$(27) Equation (27) is rearranged as follows: $$\begin{align} {y}_{j+1}^{CP}&=\frac{1}{2}\left[{y}^J+{y}^J-\textrm{rand}(){y}^J+\textrm{rand}(){y}_j+\textrm{rand}(){y}_j\right. \nonumber \\ &\left. \quad-{\left[\textrm{rand}()\right]}^2{y}_j-\textrm{rand}(){y}_{j-1}^{CP}+{\left[\textrm{rand}()\right]}^2{y}_{j-1}^{CP}\right] \end{align}$$(28) Collecting like terms gives the final update of the chronological-based PeSOA: $$\begin{align} {y}_{j+1}^{CP}&=\frac{1}{2}\left[{y}^J\left[2-\textrm{rand}()\right]+\textrm{rand}(){y}_j\left[2-\textrm{rand}()\right] \right. \nonumber \\ &\left. \quad-\textrm{rand}(){y}_{j-1}^{CP}\left[1-\textrm{rand}()\right]\right] \end{align}$$(29) The weights updated using the two algorithms, LM and chronological-based PeSOA, are then evaluated so that the weights corresponding to the minimum error are employed for training the NARX neural network, as stated in equation (12).

Step 5: terminate: the optimal weights are derived iteratively until the maximum number of iterations is reached.
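Steps 2 and 4 can be illustrated with a short Python sketch of the final chronological update, equation (29), and the selection rule, equation (12). It is a sketch under assumptions rather than the implementation used in the experiments: the function names are hypothetical, weights are treated as scalars, a single draw of rand() is reused for every occurrence in the formula (as the squared term in the derivation implies), and a tie in equation (12), which the text does not cover, is resolved in favour of the chronological-based PeSOA weight.

```python
import random


def chronological_pesoa_update(y_old, y_best, y_prev_cp, r=None):
    """Equation (29) for one scalar weight.

    y_old     : y^J, the old weight
    y_best    : y_j, the best weight
    y_prev_cp : y_{j-1}^{CP}, the PeSOA weight of the previous iteration
    r         : value of rand(); drawn uniformly from [0, 1) when omitted
    """
    if r is None:
        r = random.random()
    return 0.5 * (y_old * (2 - r) + r * y_best * (2 - r) - r * y_prev_cp * (1 - r))


def select_weights(y_lm, err_lm, y_cp, err_cp):
    """Equation (12): keep the candidate with the smaller error.

    Ties are not covered by equation (12); returning the PeSOA
    candidate in that case is an assumption of this sketch.
    """
    return y_lm if err_lm < err_cp else y_cp


if __name__ == "__main__":
    candidate_cp = chronological_pesoa_update(y_old=1.0, y_best=2.0, y_prev_cp=3.0, r=0.5)
    print(candidate_cp)
    print(select_weights(y_lm=0.9, err_lm=0.2, y_cp=candidate_cp, err_cp=0.5))
```

With r = 0 the update returns the old weight unchanged, and with r = 1 it returns the average of the old and best weights, which is a quick way to check an implementation against equation (29).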
Thus, whenever new data arrive for stock prediction, the features of the input data are extracted and fed to the classifier, which processes the features with respect to the original feature database and derives the class label for the input data for stock market prediction. The proposed method of stock market prediction is effective in predicting the stocks, and its predictions are fed to the applications.

4. RESULTS AND DISCUSSION
The results generated by the proposed method for stock market prediction are presented in this section. The performance of the proposed method is compared with that of the existing methods based on root mean square error (RMSE) and MAPE.

4.1. Experimental setup
The proposed CPLM-based NARX is implemented in MATLAB running on Windows 10 with 2 GB RAM and an i3 processor.

4.2. Performance metrics
The metrics employed for the analysis are MAPE and RMSE, which are formulated as follows.

4.2.1. MAPE
The MAPE is the mean of the absolute difference between the expected values and the observed values divided by the expected values. The solution with the minimum MAPE is chosen as the best solution. $$\begin{equation} MAPE=\frac{1}{k}{\sum}_{n=1}^k\left|\frac{o_n^L-{o}_n^P}{o_n^L}\right| \end{equation}$$(30) where |${o}_n^L$| represents the expected outcome, |${o}_n^P$| is the observed result of NARX and |$k$| is the number of data samples.

4.2.2. RMSE
The RMSE is the square root of the mean of the squared differences between the expected values and the observed values. $$\begin{equation} RMSE=\sqrt{\frac{1}{k}{\sum}_{n=1}^k{\left[{o}_n^L-{o}_n^P\right]}^2} \end{equation}$$(31)

4.3. Dataset description
The dataset considered for experimentation is taken from [40], which contains the stock data of companies across the world, for the years 2018 to 2019. Here, six datasets are considered to perform the stock market prediction using daily, monthly and yearly data.
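The two metrics of Section 4.2, equations (30) and (31), can be sketched in plain Python as follows. The function names `mape` and `rmse` are chosen here for illustration. Equation (30) is implemented as written, so the result is a ratio; whether the reported MAPE values are additionally scaled by 100 is not stated in the text and is not assumed in this sketch.

```python
import math


def mape(expected, observed):
    """Equation (30): mean of |(o^L - o^P) / o^L| over the k samples.

    Every expected value must be non-zero, since it is the denominator.
    """
    k = len(expected)
    return sum(abs((e - p) / e) for e, p in zip(expected, observed)) / k


def rmse(expected, observed):
    """Equation (31): square root of the mean squared difference."""
    k = len(expected)
    return math.sqrt(sum((e - p) ** 2 for e, p in zip(expected, observed)) / k)


if __name__ == "__main__":
    # Hypothetical prices, used only to exercise the two functions.
    expected = [100.0, 200.0]
    observed = [110.0, 180.0]
    print(mape(expected, observed))
    print(rmse(expected, observed))
```

Because MAPE divides by the expected value, it is scale-free, whereas RMSE is expressed in the units of the stock price; this is one reason the two metrics can rank methods differently on the same dataset.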
The six datasets comprise the historic National Stock Exchange (NSE) data of two companies, namely Reliance Communications and Relaxo Footwear, on a daily, monthly and yearly basis. Dataset 7 is the month-wise historic NSE data of Reliance Communications from January 2000 to March 2019.
Dataset 1: daily NSE stock data of Reliance Communications from 1 January 2018 to 1 March 2019.
Dataset 2: monthly NSE stock data of Reliance Communications from January 2018 to March 2019.
Dataset 3: yearly NSE stock data of Reliance Communications, set to the year 2018.
Dataset 4: daily NSE stock data of Relaxo Footwear from 1 January 2018 to 1 March 2019.
Dataset 5: monthly NSE stock data of Relaxo Footwear from January 2018 to March 2019.
Dataset 6: yearly NSE stock data of Relaxo Footwear, set to the year 2018.
Dataset 7: monthly stock data of Reliance Communications.
The dataset consists of stock data from January 2000 to March 2019 based on NSE.

4.4. Experimental results
The experimental results obtained by the proposed CPLM-based NARX for Reliance Communications and Relaxo Footwear are depicted in Fig. 3. Figure 3a depicts the input NSE data of the two companies, Fig. 3b presents the features extracted for the two companies using the 12 technical indicators, namely WILLR, ROCR, MOM, RSI, CCI, ADX, TRIX, MACD, OBV, TSF, ATR and MFI, and Fig. 3c depicts the resultant graph containing the predicted data for the two companies.

FIGURE 3. Experimental results of proposed CPLM-based NARX using Reliance Communications and Relaxo Footwear: (a) based on NSE data, (b) based on extracted features values and (c) based on predicted data.

4.5. Competing methods
The methods compared are the regression model [41], deep belief network (DBN) [42], NeuroFuzzy-NN [43] and the proposed CPLM-based NARX. They are analyzed with respect to the training data percentage in order to demonstrate the effectiveness of the proposed method.

4.5.1. Comparative analysis using dataset 1
Figure 4 illustrates the analysis of the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX in terms of MAPE and RMSE using dataset 1. The analysis is carried out for training data percentages from 50% to 90%. The analysis with MAPE for different training data percentages is illustrated in Fig. 4a.
When the training data percentage is 50%, the MAPE values obtained by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 2.69, 3.36, 3.12 and 2.45, respectively. Similarly, for 90% training data, the MAPE values computed by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 3.53, 11.16, 4.46 and 3.31, respectively. The analysis in terms of RMSE for varying training data is illustrated in Fig. 4b. For 50% training data, the RMSE values computed by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 4.632, 3.626, 5.769 and 0.590, respectively. Likewise, when the training data is 90%, the RMSE values measured by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 9.279, 25.058, 8.656 and 0.805, respectively. From the above analysis, the proposed CPLM-based NARX shows the minimum RMSE and MAPE values with respect to the existing methods.

FIGURE 4. Comparative results using dataset 1: (a) MAPE and (b) RMSE.

4.5.2. Comparative analysis using dataset 2
The analysis of the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX in terms of MAPE and RMSE using dataset 2 is illustrated in Fig. 5. The analysis with MAPE for varying training data is depicted in Fig. 5a. When the training data percentage is 50%, the MAPE values obtained by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 8.03, 11.62, 13.69 and 3.25, respectively. Similarly, for 90% training data, the MAPE values measured by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 18.86, 22.91, 27.24 and 9.79, respectively. The analysis in terms of RMSE for varying training data is illustrated in Fig. 5b.
For 50% training data, the RMSE values computed by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 157.041, 28.180, 133.994 and 24.046, respectively. Likewise, when the training data is 90%, the RMSE values measured by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 246.265, 424.423, 165.227 and 51.192, respectively. It is observed that the RMSE and MAPE values increase as the training data increases.

FIGURE 5. Comparative results using dataset 2: (a) MAPE and (b) RMSE.

4.5.3. Comparative analysis using dataset 3
The analysis based on MAPE and RMSE for the regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX using dataset 3 is illustrated in Fig. 6. The analysis using MAPE for different training data is shown in Fig. 6a. For 50% training data, the MAPE values obtained by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 0.06, 0.58, 0.26 and 0.32, respectively. Similarly, for 90% training data, the MAPE values measured by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 109.25, 68.53, 80.83 and 23.66, respectively. The analysis in terms of RMSE for varying training data is illustrated in Fig. 6b. For 50% training data, the RMSE values computed by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 20.215, 73.780, 51.289 and 1.36E-13, respectively. Likewise, when the training data is 90%, the RMSE values measured by the existing regression model, DBN, NeuroFuzzy-NN and the proposed CPLM-based NARX are 55.231, 706.914, 66.814 and 24.296, respectively. It is noted that the proposed technique attains the lowest RMSE at both training percentages and the lowest MAPE at 90% training data, whereas at 50% training data the regression model and NeuroFuzzy-NN report lower MAPE values.

FIGURE 6.
Comparative results using dataset 3: (a) MAPE and (b) RMSE.

FIGURE 7. Comparative results using dataset 4: (a) MAPE and (b) RMSE.

4.5.4. Comparative analysis using dataset 4
The analysis of the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX in terms of MAPE and RMSE using dataset 4 is shown in Fig. 7. The analysis in terms of MAPE for varying training data is illustrated in Fig. 7a. For 50% training data, the MAPE values obtained by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 0.85, 1.3, 1.23 and 0.85, respectively. Similarly, for 90% training data, the MAPE values measured by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 1.34, 3.65, 1.48 and 1.05, respectively. The analysis in terms of RMSE for varying training data is illustrated in Fig. 7b. For 50% training data, the RMSE values computed by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 9.565, 61.648, 16.420 and 7.605, respectively. Likewise, when the training data is 90%, the RMSE values measured by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 103.158, 739.039, 70.023 and 12.203, respectively. From the above analysis, the proposed CPLM-based NARX shows the minimum RMSE and MAPE values with respect to the conventional methods, matched only by the regression model's MAPE of 0.85 at 50% training data.

FIGURE 8. Comparative results using dataset 5: (a) MAPE and (b) RMSE.

FIGURE 9.
Comparative results using dataset 6: (a) MAPE and (b) RMSE.

4.5.5. Comparative analysis using dataset 5
Figure 8 shows the analysis of the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX based on MAPE and RMSE using dataset 5. The analysis based on MAPE with the training data percentage ranging from 50% to 90% is illustrated in Fig. 8a. For 50% training data, the MAPE values obtained by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 3.13, 3.86, 3.77 and 3.37, respectively. Similarly, for 90% training data, the MAPE values measured by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 5.67, 26.59, 8.39 and 4.86, respectively. The analysis in terms of RMSE for different percentages of training data is illustrated in Fig. 8b. For 50% training data, the RMSE values computed by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 408.514, 343.687, 72.867 and 19.466, respectively. Likewise, when the training data is 90%, the RMSE values measured by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 463.038, 666.725, 203.421 and 23.860, respectively. Hence, it is noted that the proposed CPLM-based NARX shows the lowest RMSE at both training percentages and the lowest MAPE at 90% training data; at 50% training data the regression model reports a slightly lower MAPE (3.13 against 3.37).

FIGURE 10. Comparative results using dataset 7: (a) MAPE and (b) RMSE.

4.5.6. Comparative analysis using dataset 6
Figure 9 illustrates the analysis of the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX in terms of MAPE and RMSE using dataset 6. The analysis in terms of MAPE for different training data percentages is illustrated in Fig. 9a.
When the training data percentage is 50%, the MAPE values obtained by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 0.11, 0.07, 0.11 and 0.11, respectively. Similarly, for 90% training data, the MAPE values measured by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 1.30, 1.1, 1.01 and 0.96, respectively. The analysis in terms of RMSE for varying training data is illustrated in Fig. 9b. For 50% training data, the RMSE values computed by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 14.796, 327.306, 6.851 and 3.66E-12, respectively. Likewise, when the training data is 90%, the RMSE values measured by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 29.939, 703.125, 12.976 and 4.154, respectively. From the above analysis, the proposed CPLM-based NARX shows the minimum RMSE at both training percentages and the minimum MAPE at 90% training data; at 50% training data only the DBN reports a lower MAPE (0.07 against 0.11).

4.5.7. Comparative analysis using dataset 7
Figure 10 illustrates the analysis of the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX in terms of MAPE and RMSE using dataset 7. The analysis with MAPE for different training data percentages is illustrated in Fig. 10a. When the training data percentage is 50%, the MAPE values obtained by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 5.75, 4.01, 3.69 and 2.68, respectively. Similarly, for 90% training data, the MAPE values computed by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 8.43, 8.14, 6.99 and 6.26, respectively. The analysis in terms of RMSE for varying training data is illustrated in Fig. 10b. For 50% training data, the RMSE values computed by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 4.41, 2.71, 1.41 and 1.32, respectively.
Likewise, when the training data is 90%, the RMSE values measured by the existing regression model, DBN, NeuroFuzzy-NN and proposed CPLM-based NARX are 7.65, 7.2, 3.75 and 3.55, respectively. From the above analysis, the proposed CPLM-based NARX shows the minimum RMSE and MAPE values with respect to the existing methods.

4.6. Comparative discussion
In this section, the proposed method is compared with the existing techniques. Table 1 presents the comparative results in terms of RMSE and MAPE.

TABLE 1. Comparative results generated in the NARX in terms of RMSE and MAPE at 90% training data.
Dataset    Metric  Regression model  DBN      NeuroFuzzy-NN  CPLM-based NARX
Dataset 1  MAPE    3.53              11.16    4.46           3.31
Dataset 1  RMSE    9.2793            25.058   8.656          0.805
Dataset 2  MAPE    18.86             22.91    27.24          9.79
Dataset 2  RMSE    246.265           424.423  165.227        51.192
Dataset 3  MAPE    109.25            68.53    80.83          23.66
Dataset 3  RMSE    55.231            706.914  66.814         24.296
Dataset 4  MAPE    1.34              3.65     1.48           1.05
Dataset 4  RMSE    103.158           739.039  70.023         12.203
Dataset 5  MAPE    5.67              26.59    8.39           4.86
Dataset 5  RMSE    463.038           666.725  203.421        23.860
Dataset 6  MAPE    1.30              1.1      1.01           0.96
Dataset 6  RMSE    29.939            703.125  12.976         4.154

Table 1 presents the comparative results for the proposed CPLM-based NARX and the three existing techniques. At 90% training data, the proposed CPLM-based NARX achieves the minimum MAPE and RMSE on all six datasets, and therefore shows better performance than the existing methods. Table 2 shows the computational time of the proposed method and the existing methods. Here, the proposed method has the minimum computational time of 5 s.

TABLE 2. Analysis based on computational time.
Comparative methods  Computational time (s)
Regression model     12.5
DBN                  11
NeuroFuzzy-NN        7.5
CPLM-based NARX      5

4.7. Discussion on overfitting issues
The proposed CPLM algorithm selects the optimal weights based on the minimum error value, which is computed between the predicted output and the ground truth value. The weights are computed by each of the two algorithms in CPLM, each of which tunes the NARX and generates its output individually. The error is computed against the ground truth value after computing the output of the corresponding algorithm. Finally, the weights of the algorithm that yields the minimum error are adopted to update the weights of the NARX. In this way, the problem of overfitting is addressed.

5. CONCLUSION
This work proposes a CPLM-based NARX model that uses the present and past states of the stock market for effective stock prediction. The data are initially subjected to feature extraction, for which technical indicators that reveal the various states of the stock market are employed. Once the features are extracted using the technical indicators, the significant features are selected using the wrapper-based feature selection method. The selected features are fed to the prediction model, which predicts the future status of the stock market from the available features and the past records of the stock market. This supports the users of the stock market in effective planning and decision making. The method is evaluated using stock market data, and its effectiveness is assessed using the performance metrics MAPE and RMSE. The proposed CPLM-based NARX showed superior performance in terms of MAPE and RMSE, with values of 0.96 and 0.805, respectively.

References
[1]. Chandar, S.K.
( 2017 ) Stock market prediction using subtractive clustering for a neuro-fuzzy hybrid approach . Clust. Comput. , 22 , 13159 – 13166 . Google Scholar Crossref Search ADS WorldCat [2]. Huang , W. , Nakamori , Y. and Wang , S.-Y. ( 2005 ) Forecasting stock market movement direction with support vector machine . Comput. Oper. Res. , 32 , 2513 – 2522 . Google Scholar Crossref Search ADS WorldCat [3]. Nguyen , T.C. , Vo , D.H., Van Nguyen , P., Nguyen , H.M. and Vo , A.T. ( 2019 ) Derivatives market and economic growth nexus: Policy implications for emerging markets . N. Am. J. Econ. Financ . Google Scholar OpenURL Placeholder Text WorldCat [4]. Kara , Y. , Boyacioglu , M.A. and Baykan , Ö.K. ( 2011 ) Predicting direction of stock price index movement using artificial neural networks and support vector machines: the sample of the Istanbul Stock Exchange . Expert Syst. Appl. , 38 , 5311 – 5319 . Google Scholar Crossref Search ADS WorldCat [5]. Wang , D. and Zhang , H. ( 2013 ) Group AHP and K-means cluster for a new segmentation of brand customer . Int. J. Adv. Comput. Technol. , 5 , 213 . Google Scholar OpenURL Placeholder Text WorldCat [6]. Xia , J. ( 2002 ) Grey System Theory to Hydrology . Huazhong University of Science and Technology Press , Wuhan, China . Google Scholar Google Preview OpenURL Placeholder Text WorldCat COPAC [7]. Malkiel , B.G. ( 2007 ) A Random Walk downWall Street: The Time-Tested Strategy for Successful Investing . WW Norton & Company . Google Scholar Google Preview OpenURL Placeholder Text WorldCat COPAC [8]. Nguyen , H.M. and Khoa , B.T. ( 2019 ) The relationship between the perceived mental benefits, online trust, and personal information disclosure in online shopping . J. Asian Financ. Econ. Bus. , 6 , 261 – 270 . Google Scholar Crossref Search ADS WorldCat [9]. Singh , R. and Srivastava , S. ( 2017 ) Stock prediction using deep learning . Multimed. Tools Appl. , 76 , 18569 – 18584 . Google Scholar Crossref Search ADS WorldCat [10]. 
Born , J.A. and Acherqui , Y. ( 2015 ) A symmetric Super Bowl stock market predictor model . Financ. Mark. Portf. Manag. , 29 , 115 – 124 . Google Scholar Crossref Search ADS WorldCat [11]. Qian , B. and Rasheed , K. ( 2007 ) Stock market prediction with multiple classifiers . Appl. Intell. , 26 , 25 – 33 . Google Scholar Crossref Search ADS WorldCat [12]. Arul , V.H. , Sivakumar , V.G., Marimuthu , R. and Chakraborty , B. ( 2019 ) An approach for speech enhancement using deep convolutional neural network . Multimed. Res. , 2 , 37 – 44 . Google Scholar OpenURL Placeholder Text WorldCat [13]. Klausner , M. ( 1984 ) Sociological Theory and the Behavior of Financial Markets . In The Social Dynamics of Financial Markets , pp. 57 – 81 . [14]. Wu , D. , Fung , G.P.C., Yu , J.X. and Liu , Z. ( 2008 ) Integrating Multiple Data Sources for Stock Prediction . In Proc. WISE , pp. 77 – 89 . Springer Link. [15]. Wu , D. , Fung , G.P.C., Yu , J.X. and Pan , Q. ( 2009 ) Stock prediction: an event-driven approach based on bursty keywords . Front. Comput. Sci. China , 3 , 145 – 157 . Google Scholar Crossref Search ADS WorldCat [16]. Menaga , D. and Revathi , S. ( 2020 ) Deep Learning: A Recent Computing Platform for Multimedia Information Retrieval . In Deep Learning Techniques and Optimization Strategies in Big Data Analytics , pp. 124 – 141 . [17]. Chen , Y. and Abraham , A. ( 2016 ) Hybrid-Learning Methods for Stock Index Modeling . In Artificial Neural Networks in Finance and Manufacturing . [18]. Thomas , R. and Rangachar , M.J.S. ( 2018 ) Hybrid optimization based DBN for face recognition using low-resolution images . Multimed. Res. , 1 , 33 – 43 . Google Scholar OpenURL Placeholder Text WorldCat [19]. Li , G.-D. , Yamaguchi , D. and Nagai , M. ( 2008 ) The development of stock exchange simulation prediction modeling by a hybrid grey dynamic model . Int. J. Adv. Manuf. Technol. , 36 , 195 – 204 . Google Scholar Crossref Search ADS WorldCat [20]. Kim , K.-J. and Lee , W.B. 
( 2004 ) Stock market prediction using artificial neural networks with optimal feature transformation . Neural Comput. Applic. , 13 , 255 – 260 . Google Scholar Crossref Search ADS WorldCat [21]. Basak , D. , Pal , S. and Patranabis , D.C. ( 2007 ) Support vector regression . Neural Inf. Process. Lett. Rev. , 11 , 203 – 224 . Google Scholar OpenURL Placeholder Text WorldCat [22]. Li , X. , Xie , H., Wang , R., Cai , Y., Cao , J., Wang , F., Min , H. and Deng , X. ( 2016 ) Empirical analysis: stock market prediction via extreme learning machine . Neural Comput. Applic. , 27 , 67 – 78 . Google Scholar Crossref Search ADS WorldCat [23]. Ouahilal , M. , Mohajir , M.E., Chahhou , M. and Mohajir , B.E.E. ( 2017 ) A novel hybrid model based on Hodrick–Prescott filter and support vector regression algorithm for optimizing stock market price prediction . J. Big Data ., 4 , 1 – 22 . Google Scholar Crossref Search ADS WorldCat [24]. Jadhav , P.P. and Joshi , S.D. ( 2018 ) ADF: adaptive dragonfly optimization algorithm enabled with the TDD properties for model transformation . Int. J. Database Theory Appl , 11 , 41 – 58 . Google Scholar OpenURL Placeholder Text WorldCat [25]. Ferreira , P. , Dionísio , A. and Movahed , S.M.S. ( 2017 ) Assessment of 48 stock markets using adaptive multifractal approach . Phys. A , 486 , 730 – 750 . Google Scholar Crossref Search ADS WorldCat [26]. Chen , W. , Hao , Z., Cai , R., Zhang , X., Hu , Y. and Liu , M. ( 2016 ) Multiple-cause discovery combined with structure learning for high-dimensional discrete data and application to stock prediction . Soft Comput. , 20 , 4575 – 4588 . Google Scholar Crossref Search ADS WorldCat [27]. Enke , D. , Grauer , M. and Mehdiyev , N. ( 2011 ) Stock market prediction with multiple regression, fuzzy type-2 clustering and neural networks . Procedia Comput. Sci. , 6 , 201 – 206 . Google Scholar Crossref Search ADS WorldCat [28]. Chong , E. , Han , C. and Park , F.C. 
( 2017 ) Deep learning networks for stock market analysis and prediction: methodology, data representations, and case studies . Expert Syst. Appl. , 83 , 187 – 205 . Google Scholar Crossref Search ADS WorldCat [29]. Wang , F. , Zhang , Y., Li , Q.R.K. and Zhang , H. ( 2017 ) Exploring mutual information-based sentimental analysis with kernel-based extreme learning machine for stock prediction . Soft Comput. , 21 , 3193 – 3205 . Google Scholar Crossref Search ADS WorldCat [30]. Pehlivanli , A.C. , Asikgil , B. and Gulay , G. ( 2016 ) Indicator selection with committee decision of filter methods for stock market price trend in ISE . Appl. Soft Comput. , 49 , 792 – 800 . Google Scholar Crossref Search ADS WorldCat [31]. Gocken , M. , Ozcalici , M., Boru , A. and Dosdogru , A.T. ( 2016 ) Integrating metaheuristics and artificial neural networks for improved stock price prediction . Expert Syst. Appl. , 44 , 320 – 331 . Google Scholar Crossref Search ADS WorldCat [32]. Malagrino , L.S. , Roman , N.T. and Monteiro , A.M. ( 2018 ) Forecasting stock market index daily direction: a Bayesian netwok approach . Expert Syst. Appl. , 105 , 11 – 12 . Google Scholar Crossref Search ADS WorldCat [33]. Nelson , D.M.Q. , Pereira , A.C.M. and Oliveira , R.A. ( 2017 ) Stock Market’s Price Movement Prediction with LSTM Neural Networks . In Int. Joint Conf. Neural Networks (IJCNN) , Anchorage, IEEE, AK, USA . [34]. Zhanga , J. , Cui , S., Xu , Y., Li , Q. and Li , T. ( 2018 ) A novel data–driven stock price trend prediction system . Expert Syst. Appl. , 97 , 60 – 69 . Google Scholar Crossref Search ADS WorldCat [35]. Gheraibia , Y. and Moussaoui , A. ( 2013 ) Penguins Search Optimization Algorithm (PeSOA) . In Proceedings of the International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems (IEA/AIE), Recent Trends in Applied Artificial Intelligence , pp. 222 – 231 . Springer Link. [36]. Di , X. 
(2014) Stock Trend Prediction with Technical Indicators Using SVM. Stanford University.
[37]. Lee, S.-J., Xu, Z., Li, T. and Yang, Y. (2018) A novel bagging C4.5 algorithm based on wrapper feature selection for supporting wise clinical decision making. J. Biomed. Inform., 78, 144–155.
[38]. Menezes, J.M.P. and Barreto, G.A. (2008) Long-term time-series prediction with the NARX network: an empirical evaluation. Neurocomputing, 71, 3335–3343.
[39]. Ranganathan, A. (2004) The Levenberg–Marquardt Algorithm. Tutor. LM Algorithm, 11, 101–110.
[40]. Stock Market Data. http://www.moneycontrol.com/stocks/histstock.php (accessed December 06, 2017).
[41]. Taneja, R. and Vaibhav (2018) Stock market prediction using regression. Int. Res. J. Eng. Technol., 5, 813–815.
[42]. Zhu, C., Yin, J. and Li, Q. (2014) A stock decision support system based on DBNs. J. Comput. Inf. Syst., 10, 883–893.
[43]. Atsalakis, G.S. and Valavanis, K.P. (2009) Forecasting stock market short-term trends using a neuro-fuzzy based methodology. Expert Syst. Appl., 36, 10696–10707.
© The British Computer Society 2020. All rights reserved.
For permissions, please e-mail: journals.permissions@oup.com. This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model).
TI - Wrapper-Enabled Feature Selection and CPLM-Based NARX Model for Stock Market Prediction
JF - The Computer Journal
DO - 10.1093/comjnl/bxaa099
DA - 2021-02-19
UR - https://www.deepdyve.com/lp/oxford-university-press/wrapper-enabled-feature-selection-and-cplm-based-narx-model-for-stock-5gCfCG0vlu
SP - 169
EP - 184
VL - 64
IS - 2
DP - DeepDyve
ER -