A distributed real-time data prediction framework for large-scale time-series data using stream processing



Publisher
Emerald Publishing
Copyright
Copyright © Emerald Group Publishing Limited
ISSN
1756-378X
DOI
10.1108/IJICC-09-2016-0033

Abstract

Purpose
The purpose of this paper is to propose a data prediction framework for scenarios that require forecasting over large-scale data sources, e.g., sensor networks, securities exchanges, and electric power secondary systems. Concretely, the proposed framework should handle several difficult requirements: the management of very large data sources, the need for a fast self-adaptive algorithm, relatively accurate prediction of multiple time series, and real-time processing.

Design/methodology/approach
First, the autoregressive integrated moving average (ARIMA)-based prediction algorithm is introduced. Second, the processing framework is designed, which includes a time-series data storage model based on HBase and a real-time distributed prediction platform based on Storm. Then, the working principle of this platform is described. Finally, a proof-of-concept testbed is presented to verify the proposed framework.

Findings
Several tests based on power grid monitoring data are provided for the proposed framework. The experimental results indicate that the predicted values are broadly consistent with the actual data, that processing efficiency is relatively high, and that resource consumption is reasonable.

Originality/value
This paper provides a distributed real-time data prediction framework for large-scale time-series data, which meets the requirements of effective management, prediction efficiency, accuracy, and high concurrency for massive data sources.
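
As a rough illustration of the per-series prediction step the abstract mentions, the sketch below fits a very small ARIMA-style model to each series independently and produces a one-step-ahead forecast. It is not the paper's algorithm: the ARIMA(1,1,0) order, the no-intercept least-squares fit, and the names `fit_ar1`, `forecast_next`, and `forecast_all` are all assumptions made for this example. In a framework like the one described, a step of this kind would presumably run inside a Storm bolt with history read from HBase; neither system is shown here.

```python
from typing import Dict, List, Sequence


def fit_ar1(diffs: Sequence[float]) -> float:
    """Least-squares estimate of phi in d[t] = phi * d[t-1] + noise (no intercept)."""
    num = sum(diffs[i] * diffs[i - 1] for i in range(1, len(diffs)))
    den = sum(d * d for d in diffs[:-1])
    # A flat differenced series gives den == 0; use phi = 0 in that case.
    return num / den if den else 0.0


def forecast_next(series: Sequence[float]) -> float:
    """One-step-ahead forecast under an assumed ARIMA(1,1,0) model.

    Steps: difference the series once, fit AR(1) on the differences,
    then add the predicted difference back onto the last observation.
    The series must be non-empty.
    """
    if len(series) < 3:
        # Too little history to fit anything; fall back to the last value.
        return float(series[-1])
    diffs = [series[i] - series[i - 1] for i in range(1, len(series))]
    phi = fit_ar1(diffs)
    return series[-1] + phi * diffs[-1]


def forecast_all(windows: Dict[str, List[float]]) -> Dict[str, float]:
    """Forecast every series independently, keyed by a (hypothetical) series id."""
    return {series_id: forecast_next(values) for series_id, values in windows.items()}


if __name__ == "__main__":
    # Hypothetical sensor ids and readings, for illustration only.
    windows = {
        "sensor-a": [1.0, 2.0, 3.0, 4.0, 5.0],
        "sensor-b": [2.0, 2.0, 2.0],
    }
    print(forecast_all(windows))
```

Because each series is forecast independently, the work partitions naturally by series id, which is the property a distributed stream processor can exploit for concurrency.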

Journal

International Journal of Intelligent Computing and Cybernetics, Emerald Publishing

Published: Jun 12, 2017
