MapReduce and Optimized Deep Network for Rainfall Prediction in Agriculture
J. P. Ananth
The Computer Journal, 2020. DOI: 10.1093/comjnl/bxz164

ABSTRACT
Rainfall prediction is an active area of research, as it enables farmers to make effective decisions about both cultivation and irrigation. Existing prediction models are cumbersome: rainfall prediction depends on several major factors, including humidity and the rainfall recorded in previous years, which leads to large time consumption and heavy computational effort in the analysis. This paper therefore introduces a rainfall prediction model based on a deep learning network, the convolutional long short-term memory (convLSTM) system, which enables prediction based on spatio-temporal patterns. The weights of the convLSTM are tuned optimally using the proposed Salp-stochastic gradient descent (S-SGD) algorithm, which integrates the Salp swarm algorithm (SSA) into the stochastic gradient descent (SGD) algorithm in order to facilitate globally optimal tuning of the weights and to assure better prediction accuracy. In addition, the proposed deep learning framework is built on the MapReduce framework, which enables effective handling of big data. Analysis using the rainfall prediction database reveals that the proposed model achieves a minimal mean square error (MSE) of 0.001 and a percentage root mean square difference (PRD) of 0.0021.

1. INTRODUCTION
Agriculture, a wide-reaching economic sector, contributes much towards the socio-economic development of India. Crop production is a unique business that depends on factors like climate and the economy [1]. Crop production is enhanced using advanced technologies in agriculture, which provide environmental information and assist in system monitoring. Historical information on crop yield is significant for the operation of industries that use agricultural products as raw material: food, livestock, chemicals, animal feed, fertilizer, poultry, seed, pesticides and paper. Accurate estimation of crop production helps these companies plan supply-chain decisions such as production scheduling [1]. Monitoring environmental factors alone is not enough; a complete solution is needed for enhancing crop yield, since various aspects affect productivity to a very high degree [2]. Agriculture depends on factors such as soil, cultivation, climate, irrigation, temperature, fertilizers, harvesting, pesticides, weeds and rainfall [1]. Rainfall has a huge impact on agriculture at every stage of crop growth. Often the effective rainfall required for crop production is insufficient, so that growth becomes a critical challenge; hence farmers need support in estimating the amount of water required for irrigation [3]. Rainfall is a significant component of the hydrological cycle, so its modeling and forecasting are essential for managing water resources, scheduling irrigation, managing agriculture and operating reservoirs. Many model families, such as soft computing, physical and stochastic time-series models, are used to model and forecast hydrological variables. Soft computing strategies include artificial neural networks (ANNs) [4] and fuzzy strategies in the water resource engineering domains associated with groundwater, drainage, water quality, drought, water availability and rainfall prediction [5].
In rainfall prediction, the month defines the beginning, duration and termination of the rainy season. Monthly rainfall values capture the intra-year distribution of rainfall more accurately than seasonal values do. Monthly rainfall plays a major role in agricultural and hydrological planning, so accurate prediction contributes substantially to the quality of decisions. The models used for rainfall prediction [6, 7, 8] are grouped into two classes: physical and data-driven models. Physical models follow physical laws to model the physical processes contributing to rainfall. Data-driven models use historical data to make future predictions [9]. Rainfall, a key element in the hydrological cycle, meets water demands. Executing an effective plan requires investigating the future, which is difficult to predict because of the complexity of atmospheric processes [10]. Designing a seasonal prediction model is hectic, and predicting rainfall on a regional scale can consume considerable time while planning the hydrological and agricultural needs of society [11]. At present, the traditional database paradigm fails to provide sufficient storage for data from Internet of Things (IoT) devices, creating a need for cloud storage. Big data mining approaches contribute much towards data analytics, and analytics based on cloud and IoT technology plays a powerful role in making smart agriculture feasible. Smart and precision systems in agriculture play a viable role in enhancing agricultural activities. Big data from multiple sources includes social networking data, sensor data and business data. Big data in the agricultural field is employed for managing the supply chain of agro products in order to reduce the cost of production [2]. Big data analytics, a significant technology, adds numerous benefits to society and organizations. The size of big data ranges from terabytes to petabytes, which can be managed using Hadoop MapReduce. Big data analytics supports discovering patterns and trends in the data [12]. Data mining approaches, such as classification, extract knowledge about the water needed for irrigation; this goes a long way towards improving the agricultural sector, as farmers are supported with advance information about the amount of water to be saved for irrigation in the absence of monsoon rains [3]. Big data analytics uses predictive analytics to assess future possibilities through data modeling; a number of techniques, such as classification, clustering and association rule mining, are available for analyzing and predicting from data [2]. The primary intention of this research is to design and develop a technique for predicting rainfall using the MapReduce framework and deep learning. Initially, the input data is given to the proposed S-SGD-based convLSTM, in which the optimal weights are selected by the S-SGD algorithm, designed through the hybridization of the Salp swarm algorithm (SSA) and the stochastic gradient descent (SGD) algorithm. The proposed prediction model is applied within the MapReduce framework to handle big data issues for effective rainfall prediction.

1.1. Contributions of the paper
1.1.1. S-SGD-based convLSTM
The main contribution of this paper is the development of the S-SGD-based convLSTM model for predicting rainfall.
The proposed model handles time-series data effectively.
1.1.2. S-SGD algorithm
The S-SGD algorithm is developed by integrating the SSA and SGD algorithms for selecting the optimal weights of the convLSTM. Modifying SGD with SSA enhances the global convergence tendency of the algorithm.
The rest of the paper is organized as follows: Section 2 reviews the literature on rainfall prediction methods, and Section 3 presents the proposed algorithm-based convLSTM. Section 4 describes the MapReduce framework used for the prediction. The results of the proposed method are discussed in Section 5, and finally, Section 6 concludes the paper.

2. LITERATURE SURVEY
This section reviews the existing methods in the literature. Eight papers are reviewed and analyzed to present the advantages and disadvantages of the methods. Beheshti et al. [10] developed centripetal accelerated particle swarm optimization (CAPSO), which exhibited high forecasting accuracy without requiring tuning of algorithmic parameters; its major drawbacks were the selection of nodes for the individual layers, the assignment of layer weights and the design of the error function. Wahyuni et al. [13] modeled an adaptive neuro-fuzzy inference system with a genetic algorithm (ANFIS-GA), which was reliable because the network was able to learn, but the method failed to detect the growing season of crops. Mohan [14] developed a weighted self-organizing map scheme to predict rainfall accurately, but its prediction rate needs to be enhanced. Hussain et al. [15] modeled a dynamic self-organized multilayer neural network inspired by the immune algorithm (DSMIA), which was capable of dealing with the clustering mechanism through modification of the propagation algorithm; its demerit was the remaining need for global optimization. Bagirov et al. [9] used a clusterwise linear regression approach that discovered trends in the data; its weakness lay in the accuracy of the rainfall prediction. Vathsala and Koolagudi [16] developed a data mining and statistical method that achieved high accuracy through effective dimensionality reduction; its drawbacks concerned the prediction of different homogeneous rainfall regions and testing on huge volumes of data. Manogaran and Lopez [17] developed a spatial cumulative sum algorithm capable of monitoring seasonal changes; however, the method failed to consider the prediction of seasonal changes. Singh [18] developed a fuzzy time series and neural network-based model that dealt effectively with expert decision-making and was able to predict the non-linear behavior of the summer monsoon; its drawback was poor accuracy at specific intervals of the time series. Nair et al. [19] used an ANN to predict monthly summer monsoon rainfall from global climate models. To avoid over-fitting during ANN training, the work used double cross-validation and a simple randomization approach, and the ANN model simulated the correct signs of rainfall anomalies over various spatial points of the Indian domain. Chatterjee et al. [20]
developed a hybrid neural network (HNN)-based model for rainfall prediction in the southern part of the state of West Bengal, India. This model was highly efficient in predicting rainfall.

3. SALP-STOCHASTIC GRADIENT DESCENT ALGORITHM FOR TRAINING THE convLSTM NETWORKS
This section illustrates the architecture of the convLSTM network, along with the training steps. convLSTM, a variant of the recurrent neural network (RNN), possesses feedback links in some layers of the network. Unlike the conventional RNN, convLSTM is well suited to predicting time series whose time steps have an arbitrary size. Additionally, the gradient vanishing problem is resolved by the memory unit, which sustains time-related information over a period. Since the prediction performance of convLSTM is better than that of the conventional RNN and time-series data is handled effectively by convLSTM networks, convLSTM is employed for rainfall prediction.

3.1. Architecture of convLSTM
convLSTM holds numerous advantages over the fully connected LSTM (FC-LSTM), which must unfold the input data into a 1D vector before processing, losing the spatial information. Unlike FC-LSTM, convLSTM organizes its inputs $I_1,\dots,I_\tau$, cell outputs $O_1,\dots,O_\tau$, hidden states $H_1,\dots,H_\tau$ and gates $P_\tau$, $Q_\tau$ and $R_\tau$ as 3D tensors. Spatial information is thus preserved in convLSTM, a highly desirable advantage over FC-LSTM. convLSTM [21] predicts the future using the inputs and the past states of their neighbors, facilitated by the convolution operator $\ast$ and the Hadamard product $\circ$. The Hadamard product is employed to preserve the constant error carousel (CEC) nature of the cells. convLSTM yields effective feature patterns when a large transitional kernel is employed, and for effective prediction the structure of convLSTM is organized into encoding and forecasting layers. The initial states and cell outputs of the final stage of the encoding network are copied to the forecasting network. The prediction output has the same dimension as the input; hence, all the states in the forecast network are concatenated and fed to a $[1\times 1]$ convolutional layer to produce the final prediction. Figure 1 depicts the architecture of the convLSTM network, distinguishing the encoding and forecasting layers.

FIGURE 1. Architecture of convLSTM.

The encoding LSTM compresses the input into hidden states from which the final rainfall prediction is forecast. Each LSTM network consists of memory units composed of gates and a cell, as shown in Fig. 2, where the calculation of the output from the individual components of the memory unit is demonstrated. The memory cell and the gates play a major role in controlling the information flow; in addition, the encoding-forecasting structure of convLSTM [22] enables forecasts based on spatio-temporal sequences. The memory cell and the three gates replace the ordinary neuron of other deep learning architectures, so that the memory cell state of convLSTM is updated from the new input, forgets unnecessary contents and outputs the result. These features make convLSTM suitable for the prediction of time-series data.
The calculation of the gates and of the output from the memory unit is explained below.

FIGURE 2. Memory unit of convLSTM.

convLSTM [23] differs from other deep learning networks in its use of feedback loops that hold the memory of the past. The output from the input gate is formulated as
$$\begin{equation} {P}_{\tau }=\sigma\left({\omega}_P^I\ast{I}_{\tau }+{\omega}_P^H\ast{H}_{\tau -1}+{\omega}_P^O\circ{O}_{\tau -1}+{\beta}^P\right) \end{equation}$$ (1)
where $I_\tau$ is the input vector and $\omega_P^I$ is the weight between the input layer and the input gate. $\sigma$ is the gate activation function, most often the sigmoid function. $\omega_P^H$ is the weight between the input gate and the previous memory output, and $\omega_P^O$ is the weight vector between the input gate and the cell output. $H_{\tau-1}$ and $O_{\tau-1}$ are the previous outputs of the memory unit and the cell, respectively. The bias of the input gate is denoted $\beta^P$. The convolution operator is denoted $\ast$, and $\circ$ denotes element-wise multiplication. The result of the forget gate is computed as
$$\begin{equation} {Q}_{\tau }=\sigma\left({\omega}_Q^I\ast{I}_{\tau }+{\omega}_Q^H\ast{H}_{\tau -1}+{\omega}_Q^O\circ{O}_{\tau -1}+{\beta}^Q\right) \end{equation}$$ (2)
where $\omega_Q^I$ is the weight between the input layer and the forget gate, $\omega_Q^H$ is the weight connecting the forget gate and the memory output of the previous step and $\omega_Q^O$ is the weight between the forget gate and the cell output. $\beta^Q$ denotes the bias of the forget gate. The output gate is formulated as
$$\begin{equation} {R}_{\tau }=\sigma\left({\omega}_R^I\ast{I}_{\tau }+{\omega}_R^H\ast{H}_{\tau -1}+{\omega}_R^O\circ{O}_{\tau }+{\beta}^R\right) \end{equation}$$ (3)
where $\omega_R^I$ denotes the weight linking the output gate and the input layer, $\omega_R^H$ the weight between the output gate and the memory unit and $\omega_R^O$ the weight between the output gate and the cell. The bias of the output gate is denoted $\beta^R$. The temporary cell state is formulated from the activation of the cell weights as
$$\begin{equation} \overset{\sim }{O_{\tau }}=\tanh\left({\omega}_C^I\ast{I}_{\tau }+{\omega}_C^H\ast{H}_{\tau -1}+{\beta}^C\right) \end{equation}$$ (4)
where $\omega_C^I$ and $\omega_C^H$ are the weights between the cell and the input layer, and between the cell and the memory unit, respectively. The bias of the cell is $\beta^C$. The cell output is the sum of the gated previous cell state and the gated temporary cell state:
$$\begin{equation} {O}_{\tau }={Q}_{\tau}\circ{O}_{\tau -1}+{P}_{\tau}\circ \overset{\sim }{O_{\tau }} \end{equation}$$ (5)
$$\begin{equation} {O}_{\tau }={Q}_{\tau}\circ{O}_{\tau -1}+{P}_{\tau}\circ \tanh\left({\omega}_C^I\ast{I}_{\tau }+{\omega}_C^H\ast{H}_{\tau -1}+{\beta}^C\right) \end{equation}$$ (6)
The output from the memory unit is given as
$$\begin{equation} {H}_{\tau }={R}_{\tau}\circ \tanh\left({O}_{\tau}\right) \end{equation}$$ (7)
where $H_\tau$ indicates the output of the memory block and $R_\tau$ is the output gate.
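For concreteness, the following minimal NumPy/SciPy sketch implements one memory-unit step (Eqs 1-7) for a single-channel 2D input. The grid shape, kernel size and random weights are illustrative assumptions, not the configuration used in the paper.

```python
import numpy as np
from scipy.signal import convolve2d

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv(w, x):
    # 2D convolution with 'same' padding, standing in for the * operator.
    return convolve2d(x, w, mode="same")

def convlstm_step(I_t, H_prev, O_prev, w, b):
    """One memory-unit step (Eqs 1-7) for a single-channel 2D input.

    I_t, H_prev, O_prev: 2D arrays (input, previous hidden state, previous cell state).
    w: dict of weights; b: dict of scalar biases. All names are illustrative.
    """
    # Input gate (Eq 1): convolutions on input/hidden, Hadamard product on the cell.
    P_t = sigmoid(conv(w["PI"], I_t) + conv(w["PH"], H_prev) + w["PO"] * O_prev + b["P"])
    # Forget gate (Eq 2).
    Q_t = sigmoid(conv(w["QI"], I_t) + conv(w["QH"], H_prev) + w["QO"] * O_prev + b["Q"])
    # Temporary cell state (Eq 4).
    O_tilde = np.tanh(conv(w["CI"], I_t) + conv(w["CH"], H_prev) + b["C"])
    # New cell state (Eqs 5-6): gated old state plus gated candidate.
    O_t = Q_t * O_prev + P_t * O_tilde
    # Output gate (Eq 3) uses the *current* cell state O_t.
    R_t = sigmoid(conv(w["RI"], I_t) + conv(w["RH"], H_prev) + w["RO"] * O_t + b["R"])
    # Hidden state (Eq 7).
    H_t = R_t * np.tanh(O_t)
    return H_t, O_t

# Example usage on an 8x8 grid with 3x3 convolution kernels; the Hadamard
# (peephole) weights are full-grid arrays, matching the element-wise product.
rng = np.random.default_rng(0)
shape, k = (8, 8), (3, 3)
w = {n: rng.standard_normal(k) * 0.1 for n in ["PI", "PH", "QI", "QH", "RI", "RH", "CI", "CH"]}
w.update({n: rng.standard_normal(shape) * 0.1 for n in ["PO", "QO", "RO"]})
b = {n: 0.0 for n in "PQRC"}
H, O = convlstm_step(rng.standard_normal(shape), np.zeros(shape), np.zeros(shape), w, b)
```

In the full network these kernels are learned by the proposed S-SGD algorithm rather than drawn at random.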
The output of the output layer is then given as
$$\begin{equation} {Z}_{\tau }=\varphi\left({\omega}_Z^H\cdot{H}_{\tau }+{\beta}^Z\right) \end{equation}$$ (8)
where $Z_\tau$ is the output vector and $\omega_Z^H$ is the weight between the output vector and the memory unit. The bias of the output layer is denoted $\beta^Z$. As Fig. 1 shows, the output from the output layer of each LSTM is forwarded to the succeeding LSTM, and finally, copies of the outputs from the LSTMs in the encoding layer are fed to the forecasting layer for the final prediction. The weights and biases of the convLSTM are thus $\omega \in \{\omega_Z^H,\omega_C^I,\omega_C^H,\omega_R^I,\omega_R^H,\omega_R^O,\omega_Q^I,\omega_Q^H,\omega_Q^O,\omega_P^I,\omega_P^H,\omega_P^O\}$ and $\beta \in \{\beta^C,\beta^Q,\beta^R,\beta^P\}$. The tuning of the weights $\omega$ and biases $\beta$ of the convLSTM is determined optimally using the proposed S-SGD algorithm.

3.2. Training algorithm for convLSTM
The optimization algorithm used to train the weights of the convLSTM [24] is the S-SGD algorithm, which modifies the SGD algorithm [25] using the SSA [26]. The SGD algorithm is a stochastic approximation of gradient descent that minimizes the objective function. Its advantages are that computing time is low, since the algorithm does not use the entire training set at each step, and that it can dynamically adjust its estimate. The algorithm has a strong tendency to avoid local optima, minimizes computing cost and converges quickly. Modifying SGD with SSA enhances the global convergence tendency of the algorithm. SSA is based on the biological behavior of salps, in which the follower salps follow a leader: the position of the leader salp is updated based on the position of the food source, and the individual salps update their positions based on the position of the leader. SSA solves single-objective optimization problems with unknown search spaces. Its adaptive mechanism allows the algorithm to avoid local solutions and ultimately find an accurate estimate of the best solution obtained during optimization; hence, it can be applied to both unimodal and multimodal problems. These advantages allow SSA to outperform recent algorithms such as the grey wolf optimizer (GWO) and artificial bee colony (ABC). The algorithmic steps of the proposed S-SGD algorithm are as follows:

Step 1: Initialization: the weights that represent the solution vector $u(\tau)$ are initialized. The solution of the algorithm at time $\tau$ is indicated as $u_v(\tau)$; $(1\le v\le q)$.

Step 2: Evaluation of the objective function: the objective function is evaluated to determine the optimal solution for training the network. The objective function is given as
$$\begin{equation} \operatorname{Min}(u)=\frac{\kappa }{2}{\left\Vert u\right\Vert}^2+ fn\left({x}_{i_{\tau }}\cdot{y}_{i_{\tau }}\right) \end{equation}$$ (9)
The optimal solution is chosen based on the minimum value of the objective function.
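As an illustration of Step 2, a minimal Python sketch of evaluating Eq (9) as a fitness function is given below. Since the paper leaves the loss term $fn(\cdot)$ unspecified, a squared error of a linear prediction is assumed here, and all names and values are illustrative.

```python
import numpy as np

def objective(u, x, y, kappa=1e-3):
    """Fitness of a candidate weight vector u (Eq 9). kappa is an assumed
    regularization weight; fn is assumed to be a squared prediction error."""
    # Regularization term, kappa/2 * ||u||^2.
    reg = 0.5 * kappa * np.dot(u, u)
    # Loss on one randomly chosen training sample (x, y): squared error of a
    # linear prediction, an illustrative stand-in for the network loss.
    loss = (np.dot(u, x) - y) ** 2
    return reg + loss

# The candidate with the smallest objective value is kept as the best solution.
rng = np.random.default_rng(1)
u_candidates = rng.standard_normal((5, 4))   # five candidate weight vectors
x, y = rng.standard_normal(4), 0.7           # one training sample
best = min(u_candidates, key=lambda u: objective(u, x, y))
```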
Step 3: Weight update using the S-SGD algorithm: the weights are updated using the proposed algorithm, which is obtained by modifying the standard SGD algorithm with the SSA. The standard update equation of SGD is given as
$$\begin{equation} {u}_v\left(\tau +1\right)=\left(1-\frac{1}{\tau}\right){u}_v\left(\tau \right)+{x}_{i_{\tau }}\cdot{y}_{i_{\tau }} \end{equation}$$ (10)
$$\begin{equation} {u}_v\left(\tau +1\right)=\left(\frac{\tau -1}{\tau}\right){u}_v\left(\tau \right)+{x}_{i_{\tau }}\cdot{y}_{i_{\tau }} \end{equation}$$ (11)
$$\begin{equation} {u}_v\left(\tau \right)=\left(\frac{\tau }{\tau -1}\right)\left[{u}_v\left(\tau +1\right)-{x}_{i_{\tau }}\cdot{y}_{i_{\tau }}\right] \end{equation}$$ (12)
where $x_{i_\tau}$ is the feature vector and $y_{i_\tau}$ the category of the $i$th training sample; $u_v(\tau+1)$ and $u_v(\tau)$ denote the weights at iterations $(\tau+1)$ and $\tau$. The standard update is computed from the weights of the previous iteration and the training input. $i_\tau$ is the index of the sample chosen for training at iteration $\tau$, so that $(x_{i_\tau}\cdot y_{i_\tau})$ is a randomly chosen training sample from the entire training data. The update equation of the SSA is given as
$$\begin{equation} {u}_v\left(\tau +1\right)=\frac{1}{2}\left[{u}_v\left(\tau \right)+{u}_{v-1}\left(\tau \right)\right] \end{equation}$$ (13)
where $u_v(\tau+1)$ indicates the position of the $v$th salp at iteration $(\tau+1)$. The modification of SGD is obtained by substituting the SGD expression (12) into the SSA update equation (13):
$$\begin{equation} {u}_v\left(\tau +1\right)=\frac{1}{2}\left[\left(\frac{\tau }{\tau -1}\right)\left[{u}_v\left(\tau +1\right)-{x}_{i_{\tau }}\cdot{y}_{i_{\tau }}\right]+{u}_{v-1}\left(\tau \right)\right] \end{equation}$$ (14)
$$\begin{equation} {u}_v\left(\tau +1\right)=\frac{1}{2}\left(\frac{\tau }{\tau -1}\right){u}_v\left(\tau +1\right)-\frac{1}{2}\left(\frac{\tau }{\tau -1}\right){x}_{i_{\tau }}\cdot{y}_{i_{\tau }}+\frac{1}{2}{u}_{v-1}\left(\tau \right) \end{equation}$$ (15)
$$\begin{equation} {u}_v\left(\tau +1\right)-\frac{1}{2}\left(\frac{\tau }{\tau -1}\right){u}_v\left(\tau +1\right)=-\frac{1}{2}\left(\frac{\tau }{\tau -1}\right){x}_{i_{\tau }}\cdot{y}_{i_{\tau }}+\frac{1}{2}{u}_{v-1}\left(\tau \right) \end{equation}$$ (16)
$$\begin{equation} {u}_v\left(\tau +1\right)\left\{1-\frac{1}{2}\left(\frac{\tau }{\tau -1}\right)\right\}=\frac{1}{2}{u}_{v-1}\left(\tau \right)-\frac{1}{2}\left(\frac{\tau }{\tau -1}\right){x}_{i_{\tau }}\cdot{y}_{i_{\tau }} \end{equation}$$ (17)
$$\begin{equation} {u}_v\left(\tau +1\right)\left(\frac{\tau -2}{2\left(\tau -1\right)}\right)=\frac{1}{2}{u}_{v-1}\left(\tau \right)-\frac{1}{2}\left(\frac{\tau }{\tau -1}\right){x}_{i_{\tau }}\cdot{y}_{i_{\tau }} \end{equation}$$ (18)
$$\begin{equation} {u}_v\left(\tau +1\right)=\frac{2\left(\tau -1\right)}{\tau -2}\left\{\frac{1}{2}{u}_{v-1}\left(\tau \right)-\frac{1}{2}\left(\frac{\tau }{\tau -1}\right){x}_{i_{\tau }}\cdot{y}_{i_{\tau }}\right\} \end{equation}$$ (19)
where $u_v(\tau+1)$ is the newly updated weight, $\tau$ the current iteration, $u_{v-1}(\tau)$ the weight of the preceding solution at the previous iteration and $(x_{i_\tau}\cdot y_{i_\tau})$ the training sample.
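A minimal Python rendering of the resulting update rule (Eq 19) may clarify the derivation. The vector shapes, the interpretation of the sample term as an element-wise product, and the loop structure are illustrative assumptions.

```python
import numpy as np

def s_sgd_update(u_prev_neighbor, x, y, tau):
    """One S-SGD weight update (Eq 19), defined for tau > 2.

    u_prev_neighbor: weight vector u_{v-1}(tau) of the preceding solution.
    (x, y): a randomly chosen training sample, combined as the product x*y
    following the paper's notation. All names are illustrative.
    """
    coeff = 2.0 * (tau - 1) / (tau - 2)
    return coeff * (0.5 * u_prev_neighbor - 0.5 * (tau / (tau - 1)) * (x * y))

# Sketch of the optimization loop: each solution is updated from its
# predecessor, and the best solution is the one minimizing Eq (9).
rng = np.random.default_rng(2)
population = [rng.standard_normal(4) for _ in range(6)]   # candidate weight vectors
for tau in range(3, 50):                                  # tau > 2 keeps Eq (19) defined
    x, y = rng.standard_normal(4), rng.standard_normal()  # random training sample
    for v in range(1, len(population)):
        population[v] = s_sgd_update(population[v - 1], x, y, tau)
```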
Equation (19) is used to formulate the optimal weights for training the network. It is clear from Equation (19) that the selection of the best solution is based on the minimum value of the objective function, and that the new solution is generated from the solution of the previous iteration together with the training sample.

Step 4: Termination: steps 2 and 3 are repeated until the maximum number of iterations is reached. Algorithm 1 shows the pseudo-code of the proposed S-SGD algorithm, which derives the optimal weights for training the convLSTM for effective rainfall prediction.

4. PROPOSED RAINFALL PREDICTION MODEL USING THE DEEP LEARNING-BASED MapReduce FRAMEWORK
Rainfall prediction plays a dominant role in planning agriculture and related activities, which remain the backbone of the Indian economy. Many models predict rainfall from weather data, but most of them fail when dealing with big data. Weather data is time-series data and takes the form of big data, demanding an effective method for the prediction. The MapReduce framework addresses the issues associated with big data, as it is capable of parallel computing with low computational overhead. Initially, the input weather data is fed to the MapReduce framework, which performs the prediction through its map and reduce functions built on S-SGD-based convLSTM networks. The proposed S-SGD algorithm, the modification of the SGD algorithm with the SSA, trains the weights of the convLSTM. The outputs from the individual mappers, which are trained with various delays, are concatenated to form the input to the reducer, from which the prediction is obtained. The proposed rainfall prediction model is shown in Fig. 3.

FIGURE 3. Block diagram of the rainfall prediction model.

Let the input weather data used for rainfall prediction in the MapReduce framework be denoted $I$. The structure of the MapReduce framework is described below.

4.1. MapReduce framework for rainfall prediction
MapReduce gains significance from its ability to deal with huge amounts of data in parallel and in a highly reliable manner. The MapReduce framework improves visualization and provides an effective, highly scalable prediction environment. It stores and distributes huge data sets across many servers, which gives the framework its high processing power. Additionally, the MapReduce framework operates on multiple data sources, whether structured or unstructured. The weather data arriving from distributed resources is fed to the individual mappers to train the convLSTM within each. MapReduce programming consists of two functions, the mapper and the reducer, engaged in mapping the input data; both use the S-SGD algorithm-based convLSTM. The mapper module consists of a number of mappers that process the input weather data, and the intermediate data formed by combining the mapper outputs is used to train the reducer, which provides the final prediction.
(i) Mapper phase: the mappers in the mapper module operate through a map function based on the convLSTM, which is tuned optimally using the proposed S-SGD algorithm. Assume there are $m$ mappers in total, represented as
$$\begin{equation} M=\left\{{M}_1,{M}_2,\dots, {M}_j,\dots, {M}_m\right\} \end{equation}$$ (20)
where $M_j$ is the $j$th mapper. The individual mappers with convLSTM are trained on the input weather data with various delays. For instance, the convLSTM in mapper 1 is trained using the data $I(l-[L+1])$, so that the predicted output from mapper 1 is $I(l+a)$. Likewise, mapper 2 is trained using the weather data $I(l-[L+2])$ to yield the predicted output $I(l+a-1)$, and the $m$th mapper is trained using the weather data $I(l)$ to yield the predicted output $I(l+1)$. In other words, each mapper performs the prediction using past records with a delay of $[L+1]$, $[L+2]$ and so on. The predicted output from the mappers is given as
$$\begin{equation} \overset{\sim }{I_{l+1}},\dots, \overset{\sim }{I_{l+a}}=\underset{I_{l+1},\dots, {I}_{l+a}}{\arg \max}\;\rho\left(\left.{I}_{l+1},\dots, {I}_{l+a}\right|\overset{\sim }{I_{l-L+1}},\overset{\sim }{I_{l-L+2}},\dots, \overset{\sim }{I_l}\right) \end{equation}$$ (21)
$$\begin{equation} \overset{\sim }{I_{l+1}},\dots, \overset{\sim }{I_{l+a}}\approx \underset{I_{l+1},\dots, {I}_{l+a}}{\arg \max}\;\rho\left(\left.{I}_{l+1},\dots, {I}_{l+a}\right|{Q}_{encode}\left(\overset{\sim }{I_{l-L+1}},\overset{\sim }{I_{l-L+2}},\dots, \overset{\sim }{I_l}\right)\right) \end{equation}$$ (22)
$$\begin{equation} \overset{\sim }{I_{l+1}},\dots, \overset{\sim }{I_{l+a}}\approx{\upsilon}_{forecast}\left({Q}_{encode}\left(\overset{\sim }{I_{l-L+1}},\overset{\sim }{I_{l-L+2}},\dots, \overset{\sim }{I_l}\right)\right) \end{equation}$$ (23)
The outputs from the mappers are concatenated to train the reducer, which is built with the convLSTM.
(ii) Reducer phase: the reducer uses a reduce function, again the convLSTM, to perform the final rainfall prediction; in this module, the convLSTM is likewise tuned optimally with the proposed S-SGD algorithm. The data input to the reducer is given as
$$\begin{equation} {I}^{inter}=\left\{I\left(l+a\right),I\left(l+a-1\right),\dots, I\left(l+1\right)\right\} \end{equation}$$ (24)
The reducer is trained using the intermediate data $I^{inter}$ to perform the final prediction, denoted $I(l+1)$.
Testing phase: during testing, the test data is fed to the mappers with the various data delays, and the concatenated output proceeds to the final prediction in the reducer. The final prediction from the reducer of the MapReduce framework yields the prediction results.
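To make the mapper/reducer flow concrete, the sketch below mimics the delayed-training scheme, with simple least-squares predictors standing in for the S-SGD-trained convLSTM networks, which are not reproduced here. The window length, the set of delays and the averaging reducer are illustrative simplifications.

```python
import numpy as np

def fit_linear(X, y):
    # Least-squares stand-in for the S-SGD-trained convLSTM inside each mapper.
    return np.linalg.lstsq(X, y, rcond=None)[0]

def mapper_predict(series, delay, window=4):
    """Map function sketch: train on windows of past records and predict the
    value `delay` steps beyond each window, as each mapper does with its own delay."""
    X = np.array([series[t - window:t] for t in range(window, len(series) - delay)])
    y = np.array([series[t + delay - 1] for t in range(window, len(series) - delay)])
    w = fit_linear(X, y)
    return float(series[-window:] @ w)   # forecast from the most recent window

def reduce_predict(series, delays=(1, 2, 3)):
    """Reduce function sketch: concatenate the mapper outputs (the intermediate
    data) and combine them; a simple average replaces the reducer's convLSTM here."""
    intermediate = [mapper_predict(series, d) for d in delays]
    return float(np.mean(intermediate))

# Example usage on a synthetic rainfall-like series.
rainfall = np.sin(np.linspace(0, 12, 120)) + 0.1 * np.random.default_rng(3).standard_normal(120)
print(reduce_predict(rainfall))
```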
5. RESULTS AND DISCUSSION
This section discusses the effectiveness of the prediction models by analyzing the data obtained from the Rainfall Prediction dataset. The metrics employed for the analysis reveal the most effective prediction model through a comparative discussion.

5.1. Experimental setup
The experimental analysis is performed in MATLAB using the Rainfall Prediction dataset [27], which contains the rainfall recorded across the whole of India and in the state of Tamil Nadu from 1901 to 2015. These two databases are split into three datasets each, providing the rainfall data month-wise, year-wise and quarterly.

5.1.1. Dataset 1
Dataset 1 is taken from the Indian rainfall database and contains the rainfall data as a month-wise series.
5.1.2. Dataset 2
Dataset 2 is taken from the Indian rainfall database and contains the rainfall data as a year-wise series.
5.1.3. Dataset 3
Dataset 3 is taken from the Indian rainfall database and contains the rainfall data on a quarterly basis.
5.1.4. Dataset 4
Dataset 4 is taken from the Tamil Nadu rainfall database and contains the rainfall data as a month-wise series.
5.1.5. Dataset 5
Dataset 5 is taken from the Tamil Nadu rainfall database and contains the rainfall data as a year-wise series.
5.1.6. Dataset 6
Dataset 6 is taken from the Tamil Nadu rainfall database and contains the rainfall data on a quarterly basis.
The analysis proceeds using these six datasets, and the results are compared based on the performance metrics.

5.2. Performance metrics
The metrics employed for the analysis are the mean square error (MSE) and the percentage root mean square difference (PRD). The MSE is computed as the mean squared difference between the estimated output and the target output; an effective method reports a minimal MSE. The MSE is computed as
$$\begin{equation} \mathrm{MSE}=\frac{1}{b}\sum \limits_{\upsilon =1}^b{\left({X}_{\upsilon }-X\right)}^2 \end{equation}$$ (25)
where $b$ is the total number of samples, $X_\upsilon$ is the $\upsilon$th estimated output and $X$ is the target output. The formula for PRD [28] is given as
$$\begin{equation} \mathrm{PRD}=\sqrt{\frac{\sum \limits_{\upsilon =1}^b{\left({X}_{\upsilon }-X\right)}^2}{\sum \limits_{\upsilon =1}^b{\left({X}_{\upsilon}\right)}^2}}\times 100 \end{equation}$$ (26)
PRD is simple and is capable of evaluating the reliability of a method in producing an accurate output.
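Both metrics are straightforward to compute; a minimal NumPy rendering of Eqs (25) and (26) follows, with illustrative arrays.

```python
import numpy as np

def mse(estimated, target):
    # Eq (25): mean squared difference between estimates and targets.
    return float(np.mean((estimated - target) ** 2))

def prd(estimated, target):
    # Eq (26): percentage root mean square difference, normalized by the
    # energy of the estimated signal as written in the paper's formula.
    return float(np.sqrt(np.sum((estimated - target) ** 2)
                         / np.sum(estimated ** 2)) * 100)

predicted = np.array([1.02, 0.98, 1.10, 0.95])
observed = np.array([1.00, 1.00, 1.05, 1.00])
print(mse(predicted, observed), prd(predicted, observed))
```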
5.3. Comparative methods
The methods taken for comparison are the convolutional long short-term memory (convLSTM), the clusterwise linear regression (CLR) technique [9], the multi-layer perceptron (MLP) classification algorithm [15] and the dynamic self-organized multilayer network inspired by the immune algorithm (DSMIA) [15], which are compared with the proposed S-SGD-based convLSTM to prove the effectiveness of the proposed prediction model.

FIGURE 4. Analysis using dataset 1: (a) MSE and (b) PRD.
FIGURE 5. Analysis using dataset 2: (a) MSE and (b) PRD.

5.4. Comparative discussion
This section demonstrates the analysis over the six datasets in terms of the performance metrics.

5.4.1. Using dataset 1
The analysis using dataset 1 is shown in Fig. 4. Figure 4a depicts the analysis based on MSE. At a training percentage of 0.4, the MSE of S-SGD-based convLSTM, convLSTM, CLR, the MLP algorithm and DSMIA is 0.009, 0.02, 0.009, 0.02 and 0.01, respectively; as the training percentage increases to 0.85, the MSE of the respective methods is 0.004, 0.018, 0.008, 0.018 and 0.0043. Figure 4b shows the analysis based on PRD. At a training percentage of 0.4, the PRD of S-SGD-based convLSTM, convLSTM, CLR, the MLP algorithm and DSMIA is 1.173, 9.8019, 5.331, 12.6389 and 5.527, respectively; at 0.85 it is 0.864, 9.3657, 5.321, 12.633 and 3.2658, respectively. The analysis reveals that the proposed method achieves better PRD and MSE values than the existing methods, convLSTM, CLR, the MLP algorithm and DSMIA.

FIGURE 6. Analysis using dataset 3: (a) MSE and (b) PRD.
FIGURE 7. Analysis using dataset 4: (a) MSE and (b) PRD.

5.4.2. Using dataset 2
The analysis using dataset 2 is shown in Fig. 5. Figure 5a depicts the analysis based on MSE. At a training percentage of 0.4, the MSE of S-SGD-based convLSTM, convLSTM, CLR, the MLP algorithm and DSMIA is 0.014, 1, 0.014, 0.796 and 0.057, respectively; at 0.85 it is 0.0135, 0.8695, 0.0135, 0.7249 and 0.0221, respectively. Figure 5b shows the analysis based on PRD. At a training percentage of 0.4, the PRD of the respective methods is 0.8814, 17.1498, 0.9360, 12.7569 and 1.7645; at 0.85 it is 0.9354, 16.4342, 0.9354, 12.7558 and 1.1752.

5.4.3. Using dataset 3
The analysis using dataset 3 is shown in Fig. 6. Figure 6a depicts the analysis based on MSE. At a training percentage of 0.4, the MSE of S-SGD-based convLSTM, convLSTM, CLR, the MLP algorithm and DSMIA is 0.0955, 0.1165, 0.1078, 0.0955 and 1, respectively; at 0.85 it is 0.0867, 0.1049, 0.099, 0.0867 and 0.2795, respectively. Figure 6b shows the analysis based on PRD. At a training percentage of 0.4, the PRD of the respective methods is 5.9631, 12.9031, 10.9715, 12.7409 and 17.87; at 0.85 it is 3.2922, 12.5674, 10.773, 12.7393 and 11.1557.

FIGURE 8. Analysis using dataset 5: (a) MSE and (b) PRD.
FIGURE 9. Analysis using dataset 6: (a) MSE and (b) PRD.
TABLE 1. Comparative analysis.

Metrics   S-SGD-based convLSTM   convLSTM   CLR      MLP algorithm   DSMIA
MSE       0.001                  0.0014     0.0011   0.0015          0.0021
PRD       0.0021                 0.1159     0.0049   0.1976          0.0045

5.4.4. Using dataset 4
The analysis using dataset 4 is shown in Fig. 7. Figure 7a depicts the analysis based on MSE. At a training percentage of 0.4, the MSE of S-SGD-based convLSTM, convLSTM, CLR, the MLP algorithm and DSMIA is 0.0011, 0.0013, 0.0011, 0.001 and 1, respectively; at 0.85 it is 0.0009, 0.0015, 0.0011, 0.0014 and 0.0009, respectively. Figure 7b shows the analysis based on PRD. At a training percentage of 0.4, the PRD of the respective methods is 2.9283, 8.5919, 7.5332, 12.5836 and 51.7960; at 0.85 it is 3.4588, 9.5439, 7.8747, 12.5918 and 6.629.

5.4.5. Using dataset 5
The analysis using dataset 5 is shown in Fig. 8. Figure 8a depicts the analysis based on MSE. At a training percentage of 0.4, the MSE of S-SGD-based convLSTM, convLSTM, CLR, the MLP algorithm and DSMIA is 0.0632, 0.9638, 0.0632, 0.8653 and 0.0738, respectively; at 0.85 it is 0.1560, 0.9656, 0.1560, 0.8597 and 0.1571, respectively. Figure 8b shows the analysis based on PRD. At a training percentage of 0.4, the PRD of the respective methods is 1.8959, 14.54, 1.8959, 12.7515 and 2.0978; at 0.85 it is 2.5948, 14.874, 3.0396, 12.7522 and 3.0895.

5.4.6. Using dataset 6
The analysis using dataset 6 is shown in Fig. 9. Figure 9a depicts the analysis based on MSE. At a training percentage of 0.4, the MSE of S-SGD-based convLSTM, convLSTM, CLR, the MLP algorithm and DSMIA is 0.1802, 0.7536, 0.7629, 0.9758 and 0.1802, respectively; at 0.85 it is 0.2732, 0.8214, 0.8269, 0.9282 and 0.2732, respectively. Figure 9b shows the analysis based on PRD. At a training percentage of 0.4, the PRD of the respective methods is 1.6308, 7.5663, 7.6869, 12.7064 and 3.0598;
at 0.85 it is 2.044, 8.1655, 8.295, 12.7075 and 3.9979, respectively.

5.5. Comparative discussion
The comparative analysis of the methods is summarized in Table 1 in terms of MSE and PRD. S-SGD-based convLSTM, convLSTM, CLR, the MLP algorithm and DSMIA achieve MSE values of 0.001, 0.0014, 0.0011, 0.0015 and 0.0021, respectively, and PRD values of 0.0021, 0.1159, 0.0049, 0.1976 and 0.0045, respectively. The comparison shows that the proposed method outperforms the existing methods in terms of both PRD and MSE.

6. CONCLUSION
This paper presented a rainfall prediction model based on a deep learning framework in which a convolutional long short-term memory (convLSTM) network is employed to render better prediction performance. Rainfall data is time-series data subject to dynamic updates; hence, the proposed prediction model based on the MapReduce framework deals with dynamically updating rainfall data effectively. The MapReduce framework uses a deep learning model based on convLSTM, tuned optimally using the proposed S-SGD algorithm, which is the modification of the stochastic gradient descent algorithm with the Salp swarm algorithm and enables optimal tuning of the network weights. The effectiveness of the proposed rainfall prediction model is demonstrated through analysis on six datasets taken from the Rainfall Prediction database. The prediction accuracy of the proposed model is high, as revealed by its minimal errors: an MSE of 0.001 and a PRD of 0.0021. The proposed method is designed for predicting rainfall only in India; in the future, it will be extended to predict rainfall in other countries, enabling farmers to farm efficiently across those regions.

Conflict of Interest: None.

REFERENCES
[1] Majumdar, J., Naraseeyappa, S. and Ankalaki, S. (2017) Analysis of agriculture data using data mining techniques: application of big data. Journal of Big Data, 4(1).
[2] Rajeswari, S., Suthendran, K. and Rajakumar, K. (2017) A smart agricultural model by integrating IoT, mobile and cloud-based big data analytics. 2017 Int. Conf. on Intelligent Computing and Control, pp. 1-5.
[3] Abishek, B. (2017) Prediction of effective rainfall and crop water needs using data mining techniques. TIAR.
[4] Vijaya, P., Raju, G. and Ray, S.K. (2016) Artificial neural network-based merging score for meta search engine. Journal of Central South University, 23, 2604-2615.
[5] Dabral, P.P. and Murry, M.Z. (2017) Modelling and forecasting of rainfall time series using SARIMA. Environmental Processes, 4(2), 399-419.
[6] Rahman, M.A., Yunsheng, L. and Sultana, N. (2017) Analysis and prediction of rainfall trends over Bangladesh using Mann-Kendall, Spearman's rho tests and ARIMA model. Meteorology and Atmospheric Physics, 129(4), 409-424.
[7] Bhomia, S., Jaiswal, N.,
Kishtawal, C.M. and Kumar, R. (2016) Multimodel prediction of monsoon rain using dynamical model selection. IEEE Transactions on Geoscience and Remote Sensing, 54(5), 2911-2917.
[8] Pandey, A.K., Agrawal, C.P. and Agrawal, M. (2017) A Hadoop based weather prediction model for classification of weather data. Proc. 2017 2nd IEEE Int. Conf. on Electrical, Computer and Communication Technologies (ICECCT), pp. 1-5.
[9] Bagirov, A.M., Mahmood, A. and Barton, A. (2017) Prediction of monthly rainfall in Victoria, Australia: clusterwise linear regression approach. Atmospheric Research, 188, 20-29.
[10] Beheshti, Z., Firouzi, M., Shamsuddin, S.M., Zibarzani, M. and Yusop, Z. (2016) A new rainfall forecasting model using the CAPSO algorithm and an artificial neural network. Neural Computing and Applications, 27(8), 2551-2565.
[11] Ramu, D.A. et al. (2017) Prediction of seasonal summer monsoon rainfall over homogenous regions of India using dynamical prediction system. Journal of Hydrology, 546, 103-112.
[12] Reddy, P.C. and Babu, A.S. (2017) Survey on weather prediction using big data analytics. 2017 Second Int. Conf. on Electrical, Computer and Communication Technologies, pp. 1-6.
[13] Wahyuni, I., Mahmudy, W.F. and Iriany, A. (2017) Rainfall prediction using hybrid adaptive neuro-fuzzy inference system (ANFIS) and genetic algorithm. Journal of Telecommunication, Electronic and Computer Engineering, 9(2-8), 51-56.
[14] Mohan, P. (2018) Weather and crop prediction using modified self organizing map for Mysore region, 11(2).
[15] Hussain, A.J., Liatsis, P. and Khalaf, M. (2018) A dynamic neural network architecture with immunology inspired optimization for weather data forecasting. Big Data Research.
[16] Vathsala, H. and Koolagudi, S.G. (2017) Prediction model for peninsular Indian summer monsoon rainfall using data mining and statistical approaches. Computers & Geosciences, 98, 55-63.
[17] Manogaran, G. and Lopez, D. (2018) Spatial cumulative sum algorithm with big data analytics for climate change detection. Computers & Electrical Engineering, 65, 207-221.
[18] Singh, P. (2016) Rainfall and financial forecasting using fuzzy time series and neural networks based model. International Journal of Machine Learning and Cybernetics.
[19] Nair, A., Singh, G. and Mohanty, U.C. (2018) Prediction of monthly summer monsoon rainfall using global climate models through artificial neural network technique. Pure and Applied Geophysics, 175(1), 403-419.
[20] Chatterjee, S., Datta, B. and Dey, N. (2018) Hybrid neural network based rainfall prediction supported by flower pollination algorithm. Neural Network World, 28(6), 497-510.
[21] Zhang, D., Lindholm, G. and Ratnaweera, H. (2018) Use long short-term memory to enhance Internet of Things for combined sewer overflow monitoring. Journal of Hydrology, 556, 409-418.
[22] Shi, X., Chen, Z., Wang, H., Yeung, D.Y., Wong, W. and Woo, W. (2015) Convolutional LSTM network: a machine learning approach for precipitation nowcasting. In Advances in Neural Information Processing Systems.
[23] Chen, J.,
Zeng, G.Q., Zhou, W., Du, W. and Lu, K.D. (2018) Wind speed forecasting using nonlinear-learning ensemble of deep learning time series prediction and extremal optimization. Energy Conversion and Management, 165, 681-695.
[24] Kim, S., Hong, S., Joh, M. and Song, S. (2017) DeepRain: ConvLSTM network for precipitation prediction using multichannel radar data, pp. 3-6.
[25] Yang, J. and Yang, G. (2018) Modified convolutional neural network based on dropout and the stochastic gradient descent optimizer. Algorithms, 11(3), 28.
[26] Mirjalili, S., Gandomi, A.H., Mirjalili, S.Z., Saremi, S., Faris, H. and Mirjalili, S.M. (2017) Salp swarm algorithm: a bio-inspired optimizer for engineering design problems. Advances in Engineering Software, 114, 163-191.
[27] Rainfall prediction dataset. [Online]. Available: https://data.gov.in/catalog/rainfall-india.
[28] Blanco-Velasco, M., Cruz-Roldan, F., Llorente, J. and López-Ferreras, F. (2005) On the use of PRD and CR parameters for ECG compression. Medical Engineering & Physics, 27(9), 798-802.

© The British Computer Society 2020. All rights reserved.