TY - GEN
T1 - A deep recurrent neural network based predictive control framework for reliable distributed stream data processing
AU - Xu, Jielong
AU - Tang, Jian
AU - Xu, Zhiyuan
AU - Yin, Chengxiang
AU - Kwiat, Kevin
AU - Kamhoua, Charles
N1 - Publisher Copyright: © 2019 IEEE
PY - 2019/5
Y1 - 2019/5
AB - In this paper, we present the design, implementation, and evaluation of a novel predictive control framework for reliable distributed stream data processing. The framework features a Deep Recurrent Neural Network (DRNN) model for performance prediction and dynamic grouping for flexible control. Specifically, we present a novel DRNN model that makes accurate performance predictions from multilevel runtime statistics, with careful consideration of interference among co-located worker processes. Moreover, we design a new grouping method, dynamic grouping, which can distribute and re-distribute data tuples to downstream tasks according to any given split ratio on the fly, so it can be used to redirect data tuples around misbehaving workers. We implemented the proposed framework on a widely used Distributed Stream Data Processing System (DSDPS), Storm. For validation and performance evaluation, we developed two representative stream data processing applications: Windowed URL Count and Continuous Queries. Extensive experimental results show that: 1) the proposed DRNN model outperforms widely used baseline solutions, ARIMA and SVR, in terms of prediction accuracy; 2) dynamic grouping works as expected; and 3) the proposed framework enhances reliability, incurring only minor performance degradation in the presence of misbehaving workers.
KW - Deep Learning
KW - Distributed Stream Data Processing
KW - Prediction
KW - Recurrent Neural Network
KW - Storm
UR - http://www.scopus.com/inward/record.url?scp=85072821698&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85072821698&partnerID=8YFLogxK
U2 - 10.1109/IPDPS.2019.00036
DO - 10.1109/IPDPS.2019.00036
M3 - Conference contribution
AN - SCOPUS:85072821698
T3 - Proceedings - 2019 IEEE 33rd International Parallel and Distributed Processing Symposium, IPDPS 2019
SP - 262
EP - 272
BT - Proceedings - 2019 IEEE 33rd International Parallel and Distributed Processing Symposium, IPDPS 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 33rd IEEE International Parallel and Distributed Processing Symposium, IPDPS 2019
Y2 - 20 May 2019 through 24 May 2019
ER -