Short-term load forecasting in smart meters with sliding window-based ARIMA algorithms

Mark Last (mlast@bgu.ac.il) · Dima Alberg (albergd@gmail.com)

Department of Industrial Engineering and Management, SCE-Shamoon College of Engineering, Bialik St., Beer-Sheva, Israel
Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, 84105 Beer-Sheva, Israel

Vietnam Journal of Computer Science (2018) 5:241–249

Abstract

Forecasting of electricity consumption for residential and industrial customers is an important task providing intelligence to the smart grid. Accurate forecasting should allow a utility provider to plan the resources as well as to take control actions to balance the supply and the demand of electricity. This paper presents two non-seasonal and two seasonal sliding window-based ARIMA (autoregressive integrated moving average) algorithms. These algorithms are developed for short-term forecasting of hourly electricity load at the district meter level. The algorithms integrate non-seasonal and seasonal ARIMA models with the OLIN (online information network) methodology. To evaluate our approach, we use a real hourly consumption data stream recorded by six smart meters during a 16-month period.

Keywords Internet of things · Smart city · Smart grid · Short-term forecasting · Incremental learning · Online information network · Sliding window · ARIMA

1 Introduction

Smart grid is becoming an increasingly popular application of the Internet of things (IoT). The smart grid includes a variety of operational and energy components connected to the Internet such as smart switches, smart meters, and smart appliances. Smart meters are aimed at monitoring and controlling household energy consumption in real time [2]. They enable two-way communication between the utility company and the customer. Their sampling rate usually varies from 10 min to 1 h with a maximum latency of 24 h.

The massive amounts of measurement data collected by smart meters can be used for customers’ load forecasting. However, power consumption patterns in both residential and non-residential buildings may change over time for multiple reasons, including variability of human behavior, changes in the number of people populating the buildings at various times, and the introduction of new electric appliances. Hence, short-term load forecasting (STLF) algorithms should be responsive to these changes by quickly learning new consumption patterns and modifying the forecasting models accordingly. The problem of a gradual “drift” in the target concept is handled by incremental learning systems via forgetting outdated data and adapting to the most recent phenomena [3]. However, the traditional ARIMA (autoregressive integrated moving average) algorithms, which are commonly used for short-term load forecasting, lack such an incremental learning mechanism: they learn the parameters of a given ARIMA model only once using a fixed training set and then apply that model to all future incoming data.

In this work, we integrate non-seasonal and seasonal ARIMA modeling with the OLIN (online information network) incremental learning methodology, which was previously developed by Last [7] for classification tasks in the presence of a concept drift. The proposed incremental ARIMA system is aimed at continuously processing an infinite stream of incoming data such as a series of load measurements at an hourly or daily resolution. It periodically rebuilds the predictive model using the sliding window approach. We implement two non-seasonal and two seasonal sliding window-based ARIMA algorithms and evaluate them on a real-world consumption data stream recorded by six smart meters during a 16-month period.

2 Related work

Penya et al.
[9] present short-term load forecasting models for non-residential buildings. According to the authors, this special domain presents different characteristics: there is no consumption at night, or it is negligible, and in any case there is a notable gap between idle and activity times. Another critical aspect is that there is usually scarce (if any) historical data on hourly load, and the load profile is sure to vary and evolve over time. The forecasting results presented for the consumption data from a university campus show that autoregressive models, being computationally simple, accurate, fast, and not requiring any trial-and-error customization or external data (e.g., temperature), are sufficient for providing acceptable prediction accuracy up to six days ahead (MAPE, mean absolute percentage error, between 5 and 11%). Other evaluated models, including linear and nonlinear regression, a neural network, a support vector machine, and a Bayesian network, have provided higher values of MAPE.

The most recent studies provide contradictory results. In Høverstad et al. [6], ARIMA methods achieve better results than artificial neural networks. On the other hand, Veit et al. [10] claim that a neural network performs slightly better than the ARIMA methods. Gerwig [4] points out that comparing the results of the different papers is difficult, as the evaluations vary not only in the consumption data sets used but also in the length of the time series forecasting horizon, granularity, and the choice of error measures.

Gerwig [5] evaluates five state-of-the-art approaches to short-term load forecasting on three publicly available data sets of power consumption in residential buildings. The following forecasting methods are chosen for evaluation: an autoregressive model (AR), k-nearest neighbor regression (KNN), decision trees (DT), random forest regression (RF), and kernel ridge regression (KRR). In addition, two simple benchmarks are used: a persistent forecast (PER), where the predicted values are equal to the last observation, and an averaging method (AVG), where the predicted values are the average of the training data for the specific time of day. Compared to other methods, the autoregressive model and the KNN method achieve the best results (MAPE of about 30% in a 24-h forecast for a single household).

This paper extends our conference paper, Alberg and Last [1], with a more detailed explanation of the proposed algorithms and a presentation of new results.

3 Proposed methodology

3.1 The incremental ARIMA paradigm

The proposed paradigm, called “Incremental ARIMA”, is presented in Fig. 1. Our incremental ARIMA system is composed of the following four components:

• Learning module: takes as input a sliding training window of a given size (in terms of the number of observations) and calculates the parameters for a given set of ARIMA models (seasonal or non-seasonal) as well as the training MAPE (mean absolute percentage error) of each induced model.
• Repository of models: stores the ARIMA models induced from the latest training window.
• Prediction module: takes as input a sliding validation window of a given size (a “prediction horizon” such as the next 24 h) and calculates the validation MAPE for each ARIMA model stored in the Repository of Models.
• Meta-learning module: takes as input the training and the validation MAPE of each [S]ARIMA model and chooses the most accurate model, which has the lowest value of validation MAPE. It also computes the start and the end points of the next training and validation windows, so that both the end point of a new training window and the start point of a new validation window are set to the end point of the previous validation window. To respond to a concept drift, a new [S]ARIMA model is induced from the latest training window every time the ratio of the validation error of the current model to its training error exceeds a pre-defined threshold Th.

[Fig. 1 Incremental OLIN learning with ARIMA. The figure shows a data stream divided into previous (T − 1), current (T), and next (T + 1) windows, with a training window and a validation window.]

3.2 The incremental ARIMA algorithms

Seasonal autoregressive integrated moving average (SARIMA) models intend to describe the current behavior of variables in terms of linear relationships with their past values. These models are also called Box–Jenkins models following the Box and Jenkins (1984) pioneering work on time series forecasting techniques. A SARIMA model can be decomposed into four parts. First, it has an integrated (I) component (d), which represents the amount of differencing to be performed on the series to make it stationary. The second component consists of an ARMA (autoregressive moving average) model for the series rendered stationary through differencing. The third component is a seasonal component, and finally, the fourth component is the seasonality period parameter. The ARMA and SARMA components are further decomposed into the corresponding AR (autoregressive) and MA (moving average) non-seasonal and seasonal components, respectively. The AR and seasonal AR components capture the correlation between the current value of the time series and some of its past non-seasonal and seasonally adjusted values. For example, AR(1) means that the current observation is correlated with its immediate past value at time t − 1. The MA and seasonal MA components represent the duration of the influence of a random non-seasonal and seasonally adjusted error. For example, MA(1) means that the error in the value of the series at time t is correlated with the shock at t − 1. The last thing to note is that most real-world time series are non-stationary, whereas ARIMA and SARIMA models usually refer to a stationary time series. Therefore, it is necessary to have a notational distinction between the original non-stationary time series and its stationary counterpart after differencing or logging.

We have used the paradigm described in the previous section to evaluate four incremental non-seasonal and seasonal (S)ARIMA algorithms: sliding window hourly ARIMA algorithm (SWH2A), sliding window hourly seasonal ARIMA algorithm (SWHSA), sliding window daily profile ARIMA algorithm (SWDP2A), and sliding window daily profile seasonal ARIMA algorithm (SWDPSA). The SWH2A and SWHSA algorithms utilize hourly consumption records in the training window for calculating the parameters of hourly ARIMA and SARIMA models. Those models are applied recursively to each hour in the validation window for predicting the hourly consumption. On the other hand, SWDP2A and SWDPSA utilize daily consumption records in the training window for calculating the parameters of daily ARIMA and SARIMA models. In addition, the hourly consumption records in the training window are used by SWDP2A and SWDPSA for calculating the 24-h average daily profile of hourly consumption. The daily ARIMA and SARIMA models are applied recursively to each day in the validation window for predicting the overall daily consumption and are then combined with the mean 24-h profile for predicting the consumption during each hour.

The following flowchart gives an intuitive representation of the sliding window hourly ARIMA algorithm (SWH2A) and the sliding window hourly seasonal ARIMA algorithm (SWHSA) (Fig. 2):

[Fig. 2 SWH[2]SA flowchart algorithm representation]

The pseudocode of the sliding window hourly ARIMA algorithm (SWH2A) and the sliding window hourly seasonal ARIMA algorithm (SWHSA) is as follows:

    Input
        W_tr – training window size (in hours)
        W_val – prediction (validation) window size (in hours)
        t_start – starting time of the data stream (in hours)
        ARIMA model type (p, d, q)
        SARIMA model type (p, d, q) with seasonal component (0, 1, 1)
        Seasonality period for SARIMA models is 24 hours
        Th – concept drift threshold
    Output
        MAPE_Tr – the training window MAPE
        MAPE_Val – the validation window MAPE
    Algorithm
        Initialize the start point of the training window: t_1 = t_start
        Compute the end point of the training window: t_2 = t_1 + W_tr
        MAPE_Val = MAPE_Tr
        While new data arrives do:
            If (MAPE_Val / MAPE_Tr > Th)
                Induce hourly (S)ARIMA model from the hourly data in the training window [t_1; t_2]:
                    (S)ARIMA-H = Model (hourlyData, p, d, q, t_1, t_2, (0, 1, 1), 24)
                Calculate MAPE_Tr for the training window [t_1; t_2]:
                    MAPE_Tr = MAPE (hourlyData, (S)ARIMA-H, t_1, t_2)
            Compute the start point of the validation window: t_3 = t_2
            Compute the end point of the validation window: t_4 = t_3 + W_val
            Calculate MAPE_Val for the validation window [t_3; t_4]:
                MAPE_Val = MAPE (hourlyData, (S)ARIMA-H, t_3, t_4)
            Compute the start point of the training window: t_1 = t_4 - W_tr
            Compute the end point of the training window: t_2 = t_4
        Loop
        Return MAPE_Tr, MAPE_Val

The following flowchart represents the sliding window daily profile SWDP2A and SWDPSA algorithms (Fig. 3):

[Fig. 3 SWDP[2]SA flowchart algorithm representation]

The pseudocode of the sliding window daily profile SWDP2A and SWDPSA algorithms is as follows:

    Input
        W_tr – training window size (in days)
        W_val – prediction (validation) window size (in days)
        t_start – starting time of the data stream (in days)
        ARIMA model type (p, d, q)
        SARIMA model type (p, d, q) with seasonal component (0, 1, 1)
        Seasonality period for SARIMA models is 7 days
        Th – concept drift threshold
    Output
        MAPE_Tr – the training window MAPE
        MAPE_Val – the validation window MAPE
    Algorithm
        Initialize the start point of the training window: t_1 = t_start
        Compute the end point of the training window: t_2 = t_1 + W_tr
        MAPE_Val = MAPE_Tr
        While new data arrives do:
            If (MAPE_Val / MAPE_Tr > Th)
                Induce daily (S)ARIMA model (S)ARIMA-D from the daily data in the training window [t_1; t_2]:
                    (S)ARIMA-D = Model (dailyData, p, d, q, t_1, t_2)
                Calculate the 24-hour load profile hoursProfile from the hourly data in the training window [t_1; t_2]
                Calculate MAPE_Tr for the training window [t_1; t_2]:
                    MAPE_Tr = MAPE (hourlyData, hoursProfile, (S)ARIMA-D, t_1, t_2)
            Compute the start point of the validation window: t_3 = t_2
            Compute the end point of the validation window: t_4 = t_3 + W_val
            Calculate MAPE_Val for the validation window [t_3; t_4]:
                MAPE_Val = MAPE (hourlyData, hoursProfile, (S)ARIMA-D, t_3, t_4)
            Compute the start point of the training window: t_1 = t_4 - W_tr
            Compute the end point of the training window: t_2 = t_4
        Loop
        Return MAPE_Tr, MAPE_Val

The formula of the (S)ARIMA forecasting model depends on the following seven parameters [8]:

• p is the number of non-seasonal autoregressive terms.
• d is the number of non-seasonal differences.
• q is the number of lagged non-seasonal forecast errors in the prediction equation.
• ps is the number of seasonal autoregressive terms.
• ds is the number of seasonal differences.
• qs is the number of lagged seasonal forecast errors in the prediction equation.
• P is the seasonal periodicity.

For example, the forecasting equation of ARIMA (1, 1, 1) for the hourly/daily load during the hour/day t is:

$$\hat{Y}_t = \mu + Y_{t-1} + \varphi (Y_{t-1} - Y_{t-2}) - \theta \cdot e_{t-1}, \tag{1}$$

where $Y_{t-1}$ and $Y_{t-2}$ stand for the actual load during the hours/days t − 1 and t − 2, respectively, $e_{t-1}$ is the forecasting error for the hour/day t − 1, and the coefficients $\mu$, $\varphi$, and $\theta$ are estimated from the hourly/daily data in the training window using a model fitting technique. When the prediction horizon (validation window) exceeds one time unit (one hour or one day, respectively), the actual values of $Y_{t-1}$ and $Y_{t-2}$ in Eq. 1 may be replaced with their forecasted values, so that the (S)ARIMA model can be applied recursively to a series of observations.

3.3 The evaluated algorithms

The training/validation MAPE (mean absolute percentage error) of a given forecasting model is calculated on N consecutive load measurements as follows:

$$\mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \frac{|\hat{Y}_i - Y_i|}{Y_i} \times 100, \tag{2}$$

where $\hat{Y}_i$ and $Y_i$ stand for the predicted and the actual load, respectively.

The 24-h load profile is calculated from the hourly data in the training window by applying the following equation to each hour of the day:

$$P_h = \frac{\sum_{d=1}^{D} Y_{dh}}{D}, \tag{3}$$

where $P_h$ is the average hourly load during the hour h (h ∈ [1, 24]), $Y_{dh}$ is the actual load during the hour h of day d, and D is the number of days in the training window. The average daily load $P_D$ can be found by summing up the values of $P_h$ over all hours of the day:

$$P_D = \sum_{h=1}^{24} P_h. \tag{4}$$

The sliding window daily profile (S)ARIMA algorithms SWDP2A and SWDPSA use the following equation to predict the hourly load during the hour h of the day d:

$$\hat{Y}_{dh} = \frac{\hat{Y}_d}{P_D} \times P_h, \tag{5}$$

where $\hat{Y}_d$ is the predicted daily load for the day d calculated by a daily (S)ARIMA model, $P_D$ is the average daily load calculated by Eq. 4 above, and $P_h$ is the average hourly load for the hour h calculated by Eq. 3 above.

In our experiments, we have also evaluated two “naïve” models: the “naïve hourly” model based on the hourly consumption during the previous day and the “naïve daily profile” model based on the daily consumption during the previous day and the average daily consumption profile during the training period. The “naïve hourly” model calculates the forecasted hourly load during the hour h of the day d by the following equation:

$$\hat{Y}_{dh} = Y_{d-1,h}, \tag{6}$$

where $Y_{d-1,h}$ is the actual load measured during the same hour h on the previous day d − 1. The “naïve daily profile” model calculates the forecasted load during the day d by the following equation:

$$\hat{Y}_d = Y_{d-1}. \tag{7}$$

Then, it estimates the hourly load during each hour of the day d using Eq. 5 above.

4 Evaluation experiments

4.1 Design of experiments

Our evaluation experiments were based on the electricity hourly load data recorded by six Powercom (http://www.powercom.co.il) meters during a 16-month period between 1 December 2012 and 31 March 2014. The meters were installed in different districts of a major Israeli city. The total number of recorded hourly observations was 61,646. The experiments with four algorithms (SWH2A, SWHSA, SWDP2A, and SWDPSA) included nine non-seasonal hourly and nine non-seasonal daily ARIMA models (see footnote 1), nine seasonal hourly SARIMA models with the period parameter of 24 h, nine seasonal daily SARIMA models (see footnote 2) with the period parameter of 7 days, and two baseline models: the “naïve hourly” model and the “naïve daily profile” model. The experiments with the SWH2A, SWHSA, and the “naïve hourly” models included three sizes of the training window (24, 48, and 96 days) and four sizes of the validation window (24, 48, 72, and 96 h). The experiments with the SWDP2A, SWDPSA, and the “naïve daily profile” models included three sizes of the training window (24, 48, and 96 days) and three sizes of the validation window (1, 2, and 3 days). The total number of models evaluated with each algorithm was 4 × 38 × 3 × 3 = 540.

Footnote 1. ARIMA models (p, d, q): (000, 001, 100, 101, 010, 011, 111, 221, 222).
Footnote 2. SARIMA models (p, d, q): (000, 001, 100, 101, 010, 011, 111, 221, 222), each with the seasonal component (0, 1, 1).

We have explored the seasonality behavior of all meters by building the average MW (megawatt) consumption plots for monthly, daily, and hourly cycles of collected data. The plots shown in Figs. 4, 5, and 6 exhibit seasonality patterns in monthly, daily, and hourly profiles, respectively.

[Fig. 4 Monthly consumption cycle for all meters]
[Fig. 5 Daily consumption cycle for all meters]
[Fig. 6 Hourly consumption cycle for all meters]

Figure 4 demonstrates a strong seasonality pattern expressed by high energy consumption in winter months versus other months. In addition, the months of June, July, and August represent relatively high and stable energy consumption. This fact coincides with the high temperatures in Israel, when many residents are using air conditioning in their homes.
Figure 5 demonstrates a weak daily seasonality characterized by a decrease in electricity load in the last 3 days of the week (Thursday, Friday, and Saturday). This result is expected, because Friday and Saturday are official weekend days in Israel and most organizations are closed and consequently consume less electricity on these days. In contrast, Fig. 6 demonstrates a strong hourly seasonality pattern expressed by the electricity load increasing from 05:00 to 11:00 and then again between 17:00 and 21:00, particularly during the workdays (Sunday–Thursday). The first pattern may be related to the start of a business day at many workplaces, whereas the second one may be explained by many people returning home from work and starting to use the electrical appliances at their homes, while at the same time, many companies are starting their afternoon shifts. It is also noteworthy that after 21:00, the load starts decreasing as people are going to sleep and stop using most of their appliances.

4.2 Results

Table 1 compares the average performance of four (S)ARIMA and the “naïve” models across all meters and training/validation window sizes. The sliding window hourly ARIMA (SWH2A) models performed significantly worse than the other models: their average validation MAPE values are about two times higher than the “naïve hourly” MAPE (11.804%). Out of the sliding window daily profile ARIMA models (SWDP2A and SWDPSA), the best result (MAPE = 10.05%) is obtained with the non-seasonal SWDP2A ARIMA (101) model. Similarly, out of the sliding window hourly ARIMA models (SWH2A and SWHSA), the best result (MAPE = 9.19%) is obtained with the seasonal SARIMA (001) model. Figure 7 shows that the SWDP2A algorithm has a more stable MAPE performance in terms of standard deviation than the SWHSA algorithm.

Table 1 Comparison of ARIMA models in terms of Avg. MAPE

    Model    SWDP2A   SWDPSA   SWH2A    SWHSA    Average
    [000]    11.592   11.457   22.826    9.689   16.454
    [001]    10.909   11.038   21.870    9.191   15.284
    [010]    10.329   16.708   23.631   11.600   14.527
    [011]    10.261   11.444   23.715   11.221   14.280
    [100]    10.363   10.241   21.640    9.246   14.643
    [101]    10.050   10.218   21.790    9.238   14.704
    [111]    10.155   15.045   23.864    9.543   13.910
    [221]    10.393   15.153   22.795   11.652   13.887
    [222]    10.358   11.331   20.999   10.432   13.219

    Naïve daily: 10.779
    Naïve hourly: 11.804

[Fig. 7 Algorithm stability comparison in terms of standard deviation. Series: STDEV[SWDP2A] and STDEV[SWHSA], plotted per model.]

Table 2 compares the average validation MAPE for each meter across various models and training/validation window sizes. In terms of the average MAPE, the daily profile model SWDP2A strongly outperforms the non-seasonal SWH2A hourly model (10.409 vs. 21.482%), slightly outperforms the seasonal SWDPSA daily profile model (10.409 vs. 11.451%), and has a similar performance to the SWHSA seasonal hourly model according to the paired sample t test at the 99% confidence level.

Table 2 Comparison of meters

    Meter     SWDP2A   SWDPSA   SWH2A    SWHSA    Average
    2478       9.627   11.331   20.083    9.740   13.215
    4364       7.748    7.613   21.743    8.478   12.891
    4429      14.618   15.675   21.231   15.293   16.902
    4470       7.877    7.680   16.584    9.139   11.170
    4740      11.042   11.643   30.980   11.559   17.260
    5521      10.823   11.020   20.217    9.585   13.944
    Average   10.409   11.451   21.482   10.622   14.230

Table 3 shows the results of one-way ANOVA testing for statistical significance of the difference between the meters. The conclusion of one-way ANOVA is that the difference is not significant, implying that we can safely refer to consolidated results of all six meters.

Table 3 ANOVA comparison of meters

    Source of variation   SS       df   MS      F      P value   F crit
    Between groups        134.41    5   26.88   0.73   0.61      2.77
    Within groups         658.94   18   36.61
    Total                 793.34   23

Table 4 compares the average validation MAPE for different sizes of the training window across various models, meters, and validation window sizes. It shows that, in general, increasing the training window size improves the forecasting performance (reduces MAPE), which indicates a relatively stable behavior of most meters during the period of at least 96 days (about three months). However, there was one meter (4429) with the best training window size (providing the lowest value of MAPE) of 48 days and another meter (5521) with the best training window size of only 576 h (24 days). Apparently, these two meters were exposed to a faster concept drift than the other four.

Table 4 Comparison of training window sizes

    Train. window (h)   SWDP2A   SWDPSA   SWH2A    SWHSA    Average
    576                 10.729   17.232   22.238   11.642   15.035
    1152                10.325   15.136   21.533   11.011   14.573
    2304                10.170   11.163   20.688    9.777   13.296
    Average             10.409   11.451   21.482   10.622   14.230

Table 5 compares the average validation MAPE for different sizes of the validation window across various models, meters, and training window sizes. It shows that on average, extending the prediction horizon reduces the performance of the forecasting models (increases MAPE). Apparently, the rate of MAPE increase is going down for the SWH2A algorithm between the window sizes of 72 and 96 h. These results confirm the common knowledge that it is more difficult to predict a more distant future.

Table 5 Comparison of validation window sizes

    Val. window (h)   SWDP2A   SWDPSA   SWH2A    SWHSA    Average
    24                10.009   11.211   20.533    9.532   13.052
    48                10.443   11.515   21.334   10.324   13.665
    72                10.787   11.642   21.957   11.213   14.177
    96                 9.047   11.331   22.103   11.428   18.036
    Average           10.409   11.451   21.482   10.622   14.230

Finally, Table 6 shows the best configuration of algorithm, model, and training window size across all six meters for each size of the validation window. The conclusion is that the seasonal SWHSA algorithm works best for the 24- and 48-h validation window sizes. For the two larger windows (72 and 96 h), the non-seasonal SWDP2A algorithm induces the most accurate forecasting models. The maximum training window size of 96 days (2304 h) is the best one for the first three configurations. In the case of the 96-h validation window (the fourth configuration), the best size of the training window is 48 days (1152 h).

Table 6 Best configuration for each prediction horizon

    Val. window (h)   Min. Avg. MAPE   Algorithm   Model           Train. window (h)
    24                 9.532           SWHSA       SARIMA(1,0,1)   2304
    48                10.324           SWHSA       SARIMA(0,0,1)   2304
    72                10.787           SWDP2A      ARIMA(1,0,1)    2304
    96                 9.047           SWDP2A      ARIMA(0,1,1)    1152

Acknowledgements This work was partially supported by the Israel Smart Grid (ISG) Consortium under the MAGNET Program, in the office of the Chief Scientist of the Ministry of Economics in Israel.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
5 Discussion and conclusions

The main contribution of this paper is the introduction of sliding window-based forecasting algorithms (SWDP2A, SWDPSA, SWH2A, and SWHSA) for electricity load prediction in smart meters. These algorithms integrate non-seasonal and seasonal time series (S)ARIMA models with the OLIN (online information network) incremental learning methodology. The main difference between the presented algorithms lies in the seasonality adjustment and the model construction phase. The non-seasonal SWH2A and seasonal SWHSA algorithms utilize hourly consumption records in the training window, whereas the non-seasonal SWDP2A and seasonal SWDPSA algorithms utilize aggregated daily consumptions and average daily profiles of hourly consumptions to obtain the parameters of induced (S)ARIMA models.

The experimental data set was recorded online by state-of-the-art smart metering technology and, after thorough preprocessing, was approved for use in the corresponding research experiments. The conducted experiments showed that the SWDP2A algorithm outperforms the SWH2A algorithm, performs similarly to the seasonal SWHSA algorithm, and has more stable MAPE performance in terms of standard deviation than the SWHSA algorithm. This remarkable finding indicates that the hourly prediction task does not require collecting massive hourly data in the training phase of model induction. It is sufficient to use daily consumption data and aggregated hourly coefficients of daily profiles for obtaining accurate hourly predictions of electricity load.

References

1. Alberg, D., Last, M.: Short-Term Load Forecasting in Smart Meters with Sliding Window-Based ARIMA Algorithms. ACIIDS 2017, pp. 299–307 (2017)
2. Depuru, S.S., Wang, L., Devabhaktuni, V.: Smart meters for power grid: Challenges, issues, advantages and status. Renewable and Sustainable Energy Reviews 15(6), 2736–2742 (2011)
3. Gama, J.: Knowledge Discovery from Data Streams. CRC Press, Boca Raton (2010)
4. Gerwig, C.: Short Term Load Forecasting for Residential Buildings - an Extensive Literature Review. Intelligent Decision Technologies, pp. 181–193 (2015)
5. Gerwig, C.: Short Term Load Forecasting for Residential Buildings. In: Hu, J., et al. (eds.) Smart Grid Inspired Future Technologies: First International Conference, SmartGIFT 2016, Liverpool, UK, May 19–20, 2016, Revised Selected Papers, pp. 69–78. Springer International Publishing (2016)
6. Høverstad, B., Tidemann, A., Langseth, H.: Effects of Data Cleansing on Load Prediction Algorithms. In: IEEE Symposium Series on Computational Intelligence 2013. IEEE (2013)
7. Last, M.: Online Classification of Nonstationary Data Streams. Intelligent Data Analysis, pp. 129–147 (2002)
8. Makridakis, S.G., Wheelwright, S.C., Hyndman, R.J.: Forecasting Methods and Applications. John Wiley & Sons, New York, NY (2008)
9. Penya, Y., Borges, C., Fernandez, I.: Short-term Load Forecasting in Non-residential Buildings. AFRICON 2011, 9(3), pp. 1–6. Livingstone (2011)
10. Veit, A., Goebel, C., Rohit, T., Doblander, C., Jacobsen, H.: Household Electricity Demand Forecasting: Benchmarking State-of-the-Art Methods. 5th International Conference on Future Energy System, pp. 233–234. ACM (2014)

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Publisher
Springer Journals
Copyright
Copyright © 2018 by The Author(s)
Subject
Computer Science; Information Systems and Communication Service; Artificial Intelligence (incl. Robotics); Computer Applications; e-Commerce/e-business; Computer Systems Organization and Communication Networks; Computational Intelligence
ISSN
2196-8888
eISSN
2196-8896
D.O.I.
10.1007/s40595-018-0119-7
Publisher site
See Article on Publisher Site

Abstract

Forecasting of electricity consumption for residential and industrial customers is an important task providing intelligence to the smart grid. Accurate forecasting should allow a utility provider to plan its resources and to take control actions that balance the supply and demand of electricity. This paper presents two non-seasonal and two seasonal sliding window-based ARIMA (autoregressive integrated moving average) algorithms. These algorithms are developed for short-term forecasting of hourly electricity load at the district meter level. The algorithms integrate non-seasonal and seasonal ARIMA models with the OLIN (online information network) methodology. To evaluate our approach, we use a real hourly consumption data stream recorded by six smart meters during a 16-month period.

Keywords Internet of things · Smart city · Smart grid · Short-term forecasting · Incremental learning · Online information network · Sliding window · ARIMA

Mark Last (mlast@bgu.ac.il), Dima Alberg (albergd@gmail.com)
Department of Industrial Engineering and Management, SCE-Shamoon College of Engineering, Bialik St., Beer-Sheva, Israel
Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, 84105 Beer-Sheva, Israel

Vietnam Journal of Computer Science (2018) 5:241–249

1 Introduction

Smart grid is becoming an increasingly popular application of the Internet of things (IoT). The smart grid includes a variety of operational and energy components connected to the Internet such as smart switches, smart meters, and smart appliances. Smart meters are aimed at monitoring and controlling household energy consumption in real time [2]. They enable two-way communication between the utility company and the customer. Their sampling rate usually varies from 10 min to 1 h with a maximum latency of 24 h.

The massive amounts of measurement data collected by the smart meters can be used for customers' load forecasting. However, power consumption patterns in both residential and non-residential buildings may change over time due to multiple reasons including variability of human behavior, changes in the number of people populating the buildings at various times, introduction of new electric appliances, etc. Hence, short-term load forecasting (STLF) algorithms should be responsive to these changes by quickly learning new consumption patterns and modifying the forecasting models accordingly. The problem of a gradual "drift" in the target concept is handled by incremental learning systems via forgetting outdated data and adapting to the most recent phenomena [3]. However, the traditional ARIMA (autoregressive integrated moving average) algorithms, which are commonly used for short-term load forecasting, lack such an incremental learning mechanism: they learn the parameters of a given ARIMA model only once using a fixed training set and then apply that model to all future incoming data.

In this work, we integrate non-seasonal and seasonal ARIMA modeling with the OLIN (online information network) incremental learning methodology, which was previously developed by Last [7] for classification tasks in the presence of a concept drift. The proposed incremental ARIMA system is aimed at continuously processing an infinite stream of incoming data such as a series of load measurements at an hourly or daily resolution. It periodically rebuilds the predictive model using the sliding window approach. We implement two non-seasonal and two seasonal sliding window-based ARIMA algorithms and evaluate them on a real-world consumption data stream recorded by six smart meters during a 16-month period.

2 Related work

Penya et al. [9] present short-term load forecasting models for non-residential buildings. According to the authors, this special domain presents different characteristics: consumption at night is absent or negligible, and there is a notable gap between idle and activity times. Another critical aspect is that there is usually scarce (if any) historical data on hourly load, and the load profile is sure to vary and evolve over time. The forecasting results presented for the consumption data from a university campus show that autoregressive models, being computationally simple, accurate, fast, and not requiring any trial-and-error customization or external data (e.g., temperature), are sufficient for providing acceptable prediction accuracy up to six days ahead (MAPE, mean absolute percentage error, between 5 and 11%). Other evaluated models, including linear and nonlinear regression, a neural network, a support vector machine, and a Bayesian network, have provided higher values of MAPE.

More recent studies provide contradictory results. In Høverstad et al. [6], ARIMA methods achieve better results than artificial neural networks. On the other hand, Veit et al. [10] claim that a neural network performs slightly better than the ARIMA methods. Gerwig [4] points out that comparing the results of the different papers is difficult, as the evaluations vary not only in the consumption data sets but also in the length of the time series, the forecasting horizon, the granularity, and the choice of error measures.

Gerwig [5] evaluates five state-of-the-art approaches to short-term load forecasting on three publicly available data sets of power consumption in residential buildings. The following forecasting methods are chosen for evaluation: an autoregressive model (AR), k-nearest neighbor regression (KNN), decision trees (DT), random forest regression (RF), and kernel ridge regression (KRR). In addition, two simple benchmarks are used: a persistent forecast (PER), where the predicted values are equal to the last observation, and an averaging method (AVG), where the predicted values are the average of the training data for the specific time of day. Compared to other methods, the autoregressive model and the KNN method achieve the best results (MAPE of about 30% in a 24-h forecast for a single household).

This paper extends our conference paper (Alberg and Last [1]) with a more detailed explanation of the proposed algorithms and a presentation of new results.

3 Proposed methodology

3.1 The incremental ARIMA paradigm

The proposed paradigm, called "Incremental ARIMA", is presented in Fig. 1. Our incremental ARIMA system is composed of the following four components:

• Learning module: takes as input a sliding training window of a given size (in terms of the number of observations) and calculates the parameters for a given set of ARIMA models (seasonal or non-seasonal) as well as the training MAPE (mean absolute percentage error) of each induced model.
• Repository of models: stores the ARIMA models induced from the latest training window.
• Prediction module: takes as input a sliding validation window of a given size (a "prediction horizon" such as the next 24 h) and calculates the validation MAPE for each ARIMA model stored in the Repository of Models.
• Meta-learning module: takes as input the training and the validation MAPE of each (S)ARIMA model and chooses the most accurate model, which has the lowest value of validation MAPE. It also computes the start and the end points of the next training and validation windows, so that both the end point of a new training window and the start point of a new validation window are set to the end point of the previous validation window. To respond to a concept drift, a new (S)ARIMA model is induced from the latest training window every time the validation error of the current model exceeds its training error by a pre-defined threshold Th.

[Fig. 1 Incremental OLIN learning with ARIMA. Components: Learning Module (ARIMA), Repository of Models, Prediction Module (ARIMA), and Meta-learning Module, which receives the training and validation MAPE and sets the next training and validation windows. The training and validation windows slide over the data stream (previous window T − 1, current window T, next window T + 1).]

3.2 The incremental ARIMA algorithms

Seasonal autoregressive integrated moving average (SARIMA) models describe the current behavior of variables in terms of linear relationships with their past values. These models are also called Box–Jenkins models following the pioneering work of Box and Jenkins (1984) on time series forecasting techniques. A SARIMA model can be decomposed into four parts. First, it has an integrated (I) component (d), which represents the amount of differencing to be performed on the series to make it stationary. The second component consists of an ARMA (autoregressive moving average) model for the series rendered stationary through differencing. The third component is a seasonal component, and finally, the fourth component is the seasonality period parameter. The ARMA and SARMA components are further decomposed into the corresponding AR (autoregressive) and MA (moving average) non-seasonal and seasonal components, respectively. The AR and seasonal AR components capture the correlation between the current value of the time series and some of its past non-seasonal and seasonally adjusted values. For example, AR(1) means that the current observation is correlated with its immediate past value at time t − 1. The MA and seasonal MA components represent the duration of the influence of a random non-seasonal and seasonally adjusted error. For example, MA(1) means that the error in the value of the series at time t is correlated with the shock at t − 1. Finally, most real-world time series are non-stationary, whereas ARIMA and SARIMA models usually refer to a stationary time series. Therefore, it is necessary to have a notational distinction between the original non-stationary time series and its stationary counterpart after differencing or logging.

We have used the paradigm described in the previous section to evaluate four incremental non-seasonal and seasonal (S)ARIMA algorithms: the sliding window hourly ARIMA algorithm (SWH2A), the sliding window hourly seasonal ARIMA algorithm (SWHSA), the sliding window daily profile ARIMA algorithm (SWDP2A), and the sliding window daily profile seasonal ARIMA algorithm (SWDPSA). The SWH2A and SWHSA algorithms utilize hourly consumption records in the training window for calculating the parameters of hourly ARIMA and SARIMA models. Those models are applied recursively to each hour in the validation window for predicting the hourly consumption. On the other hand, SWDP2A and SWDPSA utilize daily consumption records in the training window for calculating the parameters of daily ARIMA and SARIMA models. In addition, the hourly consumption records in the training window are used by SWDP2A and SWDPSA for calculating the 24-h average daily profile of hourly consumption. The daily ARIMA and SARIMA models are applied recursively to each day in the validation window for predicting the overall daily consumption and then combined with the mean 24-h profile for predicting the consumption during each hour.

The flowchart in Fig. 2 gives an intuitive representation of the sliding window hourly ARIMA algorithm (SWH2A) and the sliding window hourly seasonal ARIMA algorithm (SWHSA).

[Fig. 2 SWH[2]SA flowchart algorithm representation]

The pseudocode of the SWH2A and SWHSA algorithms is as follows:

Input
    Training window size W_tr (in hours)
    Prediction (validation) window size W_val (in hours)
    Starting time of the data stream t_start (in hours)
    ARIMA model type (p, d, q)
    SARIMA model type (p, d, q) with seasonal component (0, 1, 1)
    Seasonality period for SARIMA models is 24 hours
    Th: concept drift threshold
Output
    MAPE_Tr: the training window MAPE
    MAPE_Val: the validation window MAPE
Algorithm
    Initialize the start point of the training window t_1 = t_start
    Compute the end point of the training window t_2 = t_1 + W_tr
    MAPE_Val = MAPE_Tr
    While new data arrives do:
        If (MAPE_Val / MAPE_Tr > Th)
            Induce hourly (S)ARIMA model from the hourly data in the training window [t_1; t_2]:
                (S)ARIMA-H = Model (hourlyData, p, d, q, t_1, t_2, (0, 1, 1), 24)
            Calculate MAPE_Tr for the training window [t_1; t_2]:
                MAPE_Tr = MAPE (hourlyData, (S)ARIMA-H, t_1, t_2)
        Compute the start point of the validation window t_3 = t_2
        Compute the end point of the validation window t_4 = t_3 + W_val
        Calculate MAPE_Val for the validation window [t_3; t_4]:
            MAPE_Val = MAPE (hourlyData, (S)ARIMA-H, t_3, t_4)
        Compute the start point of the training window t_1 = t_4 - W_tr
        Compute the end point of the training window t_2 = t_4
    Loop
    Return MAPE_Tr, MAPE_Val

The flowchart in Fig. 3 represents the sliding window daily profile SWDP2A and SWDPSA algorithms.

[Fig. 3 SWDP[2]SA flowchart algorithm representation]

The pseudocode of the sliding window daily profile SWDP2A and SWDPSA algorithms is as follows:

Input
    Training window size W_tr (in days)
    Prediction (validation) window size W_val (in days)
    Starting time of the data stream t_start (in days)
    ARIMA model type (p, d, q)
    SARIMA model type (p, d, q) with seasonal component (0, 1, 1)
    Seasonality period for SARIMA models is 7 days
    Th: concept drift threshold
Output
    MAPE_Tr: the training window MAPE
    MAPE_Val: the validation window MAPE
Algorithm
    Initialize the start point of the training window t_1 = t_start
    Compute the end point of the training window t_2 = t_1 + W_tr
    MAPE_Val = MAPE_Tr
    While new data arrives do:
        If (MAPE_Val / MAPE_Tr > Th)
            Induce daily (S)ARIMA model (S)ARIMA-D from the daily data in the training window [t_1; t_2]:
                (S)ARIMA-D = Model (dailyData, p, d, q, t_1, t_2)
            Calculate the 24-hour load profile hoursProfile from the hourly data in the training window [t_1; t_2]
            Calculate MAPE_Tr for the training window [t_1; t_2]:
                MAPE_Tr = MAPE (hourlyData, hoursProfile, (S)ARIMA-D, t_1, t_2)
        Compute the start point of the validation window t_3 = t_2
        Compute the end point of the validation window t_4 = t_3 + W_val
        Calculate MAPE_Val for the validation window [t_3; t_4]:
            MAPE_Val = MAPE (hourlyData, hoursProfile, (S)ARIMA-D, t_3, t_4)
        Compute the start point of the training window t_1 = t_4 - W_tr
        Compute the end point of the training window t_2 = t_4
    Loop
    Return MAPE_Tr, MAPE_Val

The formula of the (S)ARIMA forecasting model depends on the following seven parameters [8]:

• p is the number of non-seasonal autoregressive terms.
• d is the number of non-seasonal differences.
• q is the number of lagged non-seasonal forecast errors in the prediction equation.
• ps is the number of seasonal autoregressive terms.
• ds is the number of seasonal differences.
• qs is the number of lagged seasonal forecast errors in the prediction equation.
• P is the seasonal periodicity.

For example, the forecasting equation of ARIMA (1, 1, 1) for the hourly/daily load during the hour/day t is:

$$\hat{Y}_t = \mu + Y_{t-1} + \phi\,(Y_{t-1} - Y_{t-2}) - \theta\, e_{t-1}, \tag{1}$$

where $Y_{t-1}$ and $Y_{t-2}$ stand for the actual load during the hours/days t − 1 and t − 2, respectively, $e_{t-1}$ is the forecasting error for the hour/day t − 1, and the coefficients $\mu$, $\phi$, and $\theta$ are estimated from the hourly/daily data in the training window using a model fitting technique. When the prediction horizon (validation window) exceeds one time unit (one hour or one day, respectively), the actual values of $Y_{t-1}$ and $Y_{t-2}$ in Eq. 1 may be replaced with their forecasted values, so that the (S)ARIMA model can be applied recursively to a series of observations.

3.3 The evaluated algorithms

The training/validation MAPE (mean absolute percentage error) of a given forecasting model is calculated on N consecutive load measurements as follows:

$$\mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \frac{\left|\hat{Y}_i - Y_i\right|}{Y_i} \times 100, \tag{2}$$

where $\hat{Y}_i$ and $Y_i$ stand for the predicted and the actual load, respectively.

The 24-h load profile is calculated from the hourly data in the training window by applying the following equation to each hour of the day:

$$P_h = \frac{\sum_{d=1}^{D} Y_{dh}}{D}, \tag{3}$$

where $P_h$ is the average hourly load during the hour h (h ∈ [1, 24]), $Y_{dh}$ is the actual load during the hour h of day d, and D is the number of days in the training window. The average daily load $P_D$ can be found by summing up the values of $P_h$ over all hours of the day:

$$P_D = \sum_{h=1}^{24} P_h. \tag{4}$$

The sliding window daily profile (S)ARIMA algorithms SWDP2A and SWDPSA use the following equation to predict the hourly load during the hour h of the day d:

$$\hat{Y}_{dh} = \frac{\hat{Y}_d}{P_D} \times P_h, \tag{5}$$

where $\hat{Y}_d$ is the predicted daily load for the day d calculated by a daily (S)ARIMA model, $P_D$ is the average daily load calculated by Eq. 4, and $P_h$ is the average hourly load for the hour h calculated by Eq. 3.

In our experiments, we have also evaluated two "naïve" models: the "naïve hourly" model, based on the hourly consumption during the previous day, and the "naïve daily profile" model, based on the daily consumption during the previous day and the average daily consumption profile during the training period. The "naïve hourly" model calculates the forecasted hourly load during the hour h of the day d by the following equation:

$$\hat{Y}_{dh} = Y_{d-1,h}, \tag{6}$$

where $Y_{d-1,h}$ is the actual load measured during the same hour h on the previous day d − 1. The "naïve daily profile" model calculates the forecasted load during the day d by the following equation:

$$\hat{Y}_d = Y_{d-1}. \tag{7}$$

Then, it estimates the hourly load during each hour of the day d using Eq. 5.

4 Evaluation experiments

4.1 Design of experiments

Our evaluation experiments were based on the hourly electricity load data recorded by six Powercom (http://www.powercom.co.il) meters during a 16-month period between 01/12/2012 and 31/03/2014. The meters were installed in different districts of a major Israeli city. The total number of recorded hourly observations was 61,646. The experiments with four algorithms (SWH2A, SWHSA, SWDP2A, and SWDPSA) included nine non-seasonal hourly and nine non-seasonal daily ARIMA models (see footnote 1), nine seasonal hourly SARIMA models (see footnote 2) with the period parameter of 24 h, nine seasonal daily SARIMA models (see footnote 2) with the period parameter of 7 days, and two baseline models: the "naïve hourly" model and the "naïve daily profile" model. The experiments with the SWH2A, SWHSA, and "naïve hourly" models included three sizes of the training window (24, 48, and 96 days) and four sizes of the validation window (24, 48, 72, and 96 h). The experiments with the SWDP2A, SWDPSA, and "naïve daily profile" models included three sizes of the training window (24, 48, and 96 days) and three sizes of the validation window (1, 2, and 3 days). The total number of models evaluated with each algorithm was 4 × 38 × 3 × 3 = 540.

Footnote 1: ARIMA models: (000, 001, 100, 101, 010, 011, 111, 221, 222).
Footnote 2: SARIMA models: (000, 001, 100, 101, 010, 011, 111, 221, 222) (0,1,1).

We have explored the seasonality behavior of all meters by building the average MW (megawatt) consumption plots for monthly, daily, and hourly cycles of collected data. The plots shown in Figs. 4, 5, and 6 exhibit seasonality patterns in monthly, daily, and hourly profiles, respectively.

[Fig. 4 Monthly consumption cycle for all meters]
[Fig. 5 Daily consumption cycle for all meters]
[Fig. 6 Hourly consumption cycle for all meters]

Figure 4 demonstrates a strong seasonality pattern expressed by high energy consumption in winter months versus other months. In addition, the months of June, July, and August show relatively high and stable energy consumption. This coincides with the high temperatures in Israel, when many residents use air conditioning in their homes. Figure 5 demonstrates a weak daily seasonality characterized by a decrease in electricity load in the last 3 days of the week (Thursday, Friday, and Saturday). This result is expected, because Friday and Saturday are official weekend days in Israel and most organizations are closed, and consequently, they consume less electricity on these days. In contrast, Fig. 6 demonstrates a strong hourly seasonality pattern expressed by the electricity load increasing from 05:00 to 11:00 and then again between 17:00 and 21:00, particularly during the workdays (Sunday–Thursday). The first pattern may be related to the start of a business day at many workplaces, whereas the second one may be explained by many people returning home from work and starting to use the electrical appliances at their homes, while at the same time, many companies are starting their afternoon shifts. It is also noteworthy that after 21:00, the load starts decreasing as people go to sleep and stop using most of their appliances.

4.2 Results

Table 1 compares the average performance of the four (S)ARIMA algorithms and the "naïve" models across all meters and training/validation window sizes. The sliding window hourly ARIMA (SWH2A) models performed significantly worse than the other models: their average validation MAPE values are about two times higher than the "naïve hourly" MAPE (11.804%). Out of the sliding window daily profile ARIMA models (SWDP2A and SWDPSA), the best result (MAPE = 10.05%) is obtained with the non-seasonal SWDP2A ARIMA (101) model. Similarly, out of the sliding window hourly ARIMA models (SWH2A and SWHSA), the best result (MAPE = 9.19%) is obtained with the seasonal SARIMA (001) model. Figure 7 shows that the SWDP2A algorithm has a more stable MAPE performance in terms of standard deviation than the SWHSA algorithm.

Table 1 Comparison of ARIMA models in terms of Avg. MAPE (%)

Model    SWDP2A   SWDPSA   SWH2A    SWHSA    Average
[000]    11.592   11.457   22.826    9.689   16.454
[001]    10.909   11.038   21.870    9.191   15.284
[010]    10.329   16.708   23.631   11.600   14.527
[011]    10.261   11.444   23.715   11.221   14.280
[100]    10.363   10.241   21.640    9.246   14.643
[101]    10.050   10.218   21.790    9.238   14.704
[111]    10.155   15.045   23.864    9.543   13.910
[221]    10.393   15.153   22.795   11.652   13.887
[222]    10.358   11.331   20.999   10.432   13.219

Naïve daily: 10.779
Naïve hourly: 11.804

[Fig. 7 Algorithm stability comparison in terms of standard deviation. Axes: MAPE % by model; series: STDEV[SWDP2A], STDEV[SWHSA].]

Table 2 compares the average validation MAPE for each meter across various models and training/validation window sizes. In terms of the average MAPE, the daily profile algorithm SWDP2A strongly outperforms the non-seasonal hourly algorithm SWH2A (10.409 vs. 21.482%), slightly outperforms the seasonal daily profile algorithm SWDPSA (10.409 vs. 11.451%), and has a similar performance to the seasonal hourly algorithm SWHSA according to the paired sample t test at the 99% confidence level.

Table 2 Comparison of meters (Avg. MAPE, %)

Meter     SWDP2A   SWDPSA   SWH2A    SWHSA    Average
2478       9.627   11.331   20.083    9.740   13.215
4364       7.748    7.613   21.743    8.478   12.891
4429      14.618   15.675   21.231   15.293   16.902
4470       7.877    7.680   16.584    9.139   11.170
4740      11.042   11.643   30.980   11.559   17.260
5521      10.823   11.020   20.217    9.585   13.944
Average   10.409   11.451   21.482   10.622   14.230

Table 3 shows the results of one-way ANOVA testing for statistical significance of the difference between the meters. The conclusion of the one-way ANOVA is that the difference is not significant, implying that we can safely refer to the consolidated results of all six meters.

Table 3 ANOVA comparison of meters

Source of variation   SS       df   MS      F      P value   F crit
Between groups        134.41    5   26.88   0.73   0.61      2.77
Within groups         658.94   18   36.61
Total                 793.34   23

Table 4 compares the average validation MAPE for different sizes of the training window across various models, meters, and validation window sizes. It shows that, in general, increasing the training window size improves the forecasting performance (reduces MAPE), which indicates a relatively stable behavior of most meters during a period of at least 96 days (about three months). However, there was one meter (4429) with the best training window size (providing the lowest value of MAPE) of 48 days and another meter (5521) with the best training window size of only 576 h (24 days). Apparently, these two meters were exposed to a faster concept drift than the other four.

Table 4 Comparison of training window sizes (Avg. MAPE, %)

Train. window (h)   SWDP2A   SWDPSA   SWH2A    SWHSA    Average
576                 10.729   17.232   22.238   11.642   15.035
1152                10.325   15.136   21.533   11.011   14.573
2304                10.170   11.163   20.688    9.777   13.296
Average             10.409   11.451   21.482   10.622   14.230

Table 5 compares the average validation MAPE for different sizes of the validation window across various models, meters, and training window sizes. It shows that on average, extending the prediction horizon reduces the performance of the forecasting models (increases MAPE). Apparently, the rate of MAPE increase slows down for the SWH2A algorithm between the window sizes of 72 and 96 h. These results confirm the common knowledge that it is more difficult to predict a more distant future.

Table 5 Comparison of validation window sizes (Avg. MAPE, %)

Val. window (h)   SWDP2A   SWDPSA   SWH2A    SWHSA    Average
24                10.009   11.211   20.533    9.532   13.052
48                10.443   11.515   21.334   10.324   13.665
72                10.787   11.642   21.957   11.213   14.177
96                 9.047   11.331   22.103   11.428   18.036
Average           10.409   11.451   21.482   10.622   14.230

Finally, Table 6 shows the best configuration of algorithm, model, and training window size across all six meters for each size of the validation window. The conclusion is that the seasonal SWHSA algorithm works best for the 24- and 48-h validation window sizes. For the two larger windows (72 and 96 h), the non-seasonal SWDP2A algorithm induces the most accurate forecasting models. The maximum training window size of 96 days (2304 h) is the best one for the first three configurations. In the case of the 96-h validation window (the fourth configuration), the best size of the training window is 48 days (1152 h).

Table 6 Best configuration for each prediction horizon

Val. window (h)   Min. Avg. MAPE   Algorithm   Model           Train. window (h)
24                 9.532           SWHSA       SARIMA(1,0,1)   2304
48                10.324           SWHSA       SARIMA(0,0,1)   2304
72                10.787           SWDP2A      ARIMA(1,0,1)    2304
96                 9.047           SWDP2A      ARIMA(0,1,1)    1152

5 Discussion and conclusions

The main contribution of this paper is the introduction of sliding window-based forecasting algorithms (SWDP2A, SWDPSA, SWH2A, and SWHSA) for electricity load prediction in smart meters. These algorithms integrate non-seasonal and seasonal time series (S)ARIMA models with the OLIN (online information network) incremental learning methodology. The main difference between the presented algorithms lies in the seasonality adjustment and the model construction phase. The non-seasonal SWH2A and seasonal SWHSA algorithms utilize hourly consumption records in the training window, whereas the non-seasonal SWDP2A and seasonal SWDPSA algorithms utilize aggregated daily consumptions and average daily profiles of hourly consumptions to obtain the parameters of the induced (S)ARIMA models.

The experimental data set was recorded online by state-of-the-art smart metering technology and, after thorough preprocessing, was approved for use in the corresponding research experiments. The conducted experiments showed that the SWDP2A algorithm outperforms the SWH2A algorithm, performs similarly to the seasonal SWHSA algorithm, and has a more stable MAPE performance in terms of standard deviation than the SWHSA algorithm. This finding indicates that the hourly prediction task does not require collecting massive hourly data in the training phase of model induction. It is sufficient to use daily consumption data and aggregated hourly coefficients of daily profiles for obtaining accurate hourly predictions of electricity load.

Acknowledgements This work was partially supported by the Israel Smart Grid (ISG) Consortium under the MAGNET Program, in the office of the Chief Scientist of the Ministry of Economics in Israel.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

References

1. Alberg, D., Last, M.: Short-Term Load Forecasting in Smart Meters with Sliding Window-Based ARIMA Algorithms. ACIIDS 2017, pp. 299–307 (2017)
2. Depuru, S.S., Wang, L., Devabhaktuni, V.: Smart meters for power grid: Challenges, issues, advantages and status. Renewable and Sustainable Energy Reviews 15(6), 2736–2742 (2011)
3. Gama, J.: Knowledge Discovery from Data Streams. CRC Press, Boca Raton (2010)
4. Gerwig, C.: Short Term Load Forecasting for Residential Buildings - an Extensive Literature Review. Intelligent Decision Technologies, pp. 181–193 (2015)
5. Gerwig, C.: Short Term Load Forecasting for Residential Buildings. In J. Hu et al., Smart Grid Inspired Future Technologies: First International Conference, SmartGIFT 2016, Liverpool, UK, May 19–20, 2016, Revised Selected Papers, pp. 69–78. Springer International Publishing (2016)
6. Høverstad, B., Tidemann, A., Langseth, H.: Effects of Data Cleansing on Load Prediction Algorithms. In IEEE Symposium Series on Computational Intelligence 2013. IEEE (2013)
7. Last, M.: Online Classification of Nonstationary Data Streams. Intelligent Data Analysis, pp. 129–147 (2002)
8. Makridakis, S.G., Wheelwright, S.C., Hyndman, R.J.: Forecasting Methods and Applications. John Wiley & Sons, New York, NY (2008)
9. Penya, Y., Borges, C., Fernandez, I.: Short-term Load Forecasting in Non-residential Buildings. AFRICON 2011, 9(3), pp. 1–6. Livingstone (2011)
10. Veit, A., Goebel, C., Rohit, T., Doblander, C., Jacobsen, H.: Household Electricity Demand Forecasting: Benchmarking State-of-the-Art Methods. 5th International Conference on Future Energy System, pp. 233–234. ACM (2014)

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
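
As an illustration of the daily profile scheme from Sect. 3.3 (MAPE in Eq. 2, the 24-h profile in Eqs. 3 and 4, and the daily-to-hourly split in Eq. 5) combined with the sliding-window retraining rule from the pseudocode in Sect. 3.2, the following Python sketch may be helpful. It is a minimal sketch under stated assumptions, not the authors' implementation: the naïve daily forecast of Eq. 7 stands in for a fitted daily (S)ARIMA model, all function names are hypothetical, the first window is assumed to always trigger model induction, and the drift test is written as a product rather than a ratio to avoid dividing by a zero training MAPE.

```python
from statistics import mean


def mape(predicted, actual):
    """Eq. 2: mean absolute percentage error; actual loads must be non-zero."""
    return 100.0 * mean([abs(p - a) / a for p, a in zip(predicted, actual)])


def hourly_profile(days):
    """Eq. 3: average load P_h per hour of day over the D days in the window."""
    return [mean([day[h] for day in days]) for h in range(24)]


def disaggregate(daily_forecast, profile):
    """Eq. 5: split a daily forecast into 24 hourly loads in proportion to P_h."""
    p_day = sum(profile)  # Eq. 4: average daily load P_D
    return [daily_forecast / p_day * p_h for p_h in profile]


def window_mape(days, profile, start, end):
    """Hourly MAPE over days[start:end]; requires 1 <= start < end."""
    predicted, actual = [], []
    for d in range(start, end):
        # Assumption: Eq. 7 (previous day's total) replaces the daily (S)ARIMA forecast.
        daily_forecast = sum(days[d - 1])
        predicted.extend(disaggregate(daily_forecast, profile))
        actual.extend(days[d])
    return mape(predicted, actual)


def sliding_window_run(days, w_tr, w_val, th):
    """Slide over `days` (a list of 24-value lists, w_tr >= 2 days);
    return the validation MAPE of every validation window."""
    t1, t2 = 0, w_tr
    profile, mape_tr, mape_val = None, 0.0, 0.0
    history = []
    while t2 + w_val <= len(days):
        # Refit on the first pass, or when MAPE_Val / MAPE_Tr > Th
        # (expressed as a product so a zero training MAPE cannot divide by zero).
        if profile is None or mape_val > th * mape_tr:
            profile = hourly_profile(days[t1:t2])
            mape_tr = window_mape(days, profile, max(t1, 1), t2)
        t3, t4 = t2, t2 + w_val
        mape_val = window_mape(days, profile, t3, t4)
        history.append(mape_val)
        t1, t2 = t4 - w_tr, t4  # next training window ends where validation ended
    return history
```

For example, `sliding_window_run(days, w_tr=24, w_val=1, th=1.5)` would walk a list of daily 24-hour records with a 24-day training window and a 1-day horizon; the threshold value 1.5 is purely illustrative, since the paper does not state the value of Th used in the experiments.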

Journal

Vietnam Journal of Computer Science, Springer Journals

Published: Jun 6, 2018
