Load prediction based elastic resource scheduling strategy in Flink

Cited by: 0
Authors
Li Z. [1 ,2 ]
Yu J. [1 ,2 ]
Wang Y. [3 ]
Bian C. [4 ]
Pu Y. [2 ]
Zhang Y. [1 ]
Liu Y. [1 ]
Affiliations
[1] School of Software, Xinjiang University, Urumqi
[2] School of Information Science and Engineering, Xinjiang University, Urumqi
[3] College of Computer Science, Chengdu University, Chengdu
[4] College of Internet Finance and Information Engineering, Guangdong University of Finance, Guangzhou
Source: Journal on Communications
Funding
National Natural Science Foundation of China
Keywords
Flink; Load prediction; Performance bottleneck; Resource scheduling; Stream computing;
DOI
10.11959/j.issn.1000-436x.2020195
Abstract
To address the problem that the load of a big data stream computing platform fluctuates drastically while the cluster suffers performance bottlenecks due to a shortage of computing resources, a load prediction based elastic resource scheduling strategy in Flink (LPERS-Flink) was proposed. Firstly, a load prediction model was established as the foundation of a load prediction algorithm that forecasts the variation tendency of the processing load. Secondly, a resource judgment model was built to identify performance bottlenecks and resource redundancy in the cluster, and a resource scheduling algorithm was proposed to draw up the resource rescheduling plan. Finally, an online load migration algorithm was proposed to execute the rescheduling plan and migrate processing load among nodes efficiently. Experimental results show that the strategy delivers better performance for applications whose processing load fluctuates drastically: the scale and resource configuration of the cluster respond to load variation in time, and the communication overhead of load migration is reduced effectively. © 2020, Editorial Board of Journal on Communications. All rights reserved.
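The abstract outlines a three-stage control loop: predict the load trend, judge whether the cluster is approaching a bottleneck or holds redundant resources, and then draw up a rescheduling plan to be executed by online load migration. The following is a minimal sketch of such a predict-then-scale loop, not the paper's actual models: the class name ElasticSchedulerSketch, the exponential-smoothing predictor, and the per-node capacity and watermark thresholds are all assumptions introduced here for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch only: a simplified predict-then-scale loop in the spirit of
// the LPERS-Flink description above. All constants and the smoothing predictor
// are assumed values, not taken from the paper.
public class ElasticSchedulerSketch {

    private final Deque<Double> recentLoad = new ArrayDeque<>(); // recent load samples (tuples/s)
    private static final int WINDOW = 10;                        // sliding-window size (assumed)
    private static final double ALPHA = 0.5;                     // smoothing factor (assumed)
    private static final double NODE_CAPACITY = 10_000;          // tuples/s per node (assumed)
    private static final double HIGH_WATERMARK = 0.8;            // bottleneck threshold (assumed)
    private static final double LOW_WATERMARK = 0.3;             // redundancy threshold (assumed)

    // Record the latest measured processing load.
    public void observe(double tuplesPerSecond) {
        if (recentLoad.size() == WINDOW) {
            recentLoad.pollFirst();
        }
        recentLoad.addLast(tuplesPerSecond);
    }

    // Stage 1: predict the near-future load by exponential smoothing over the window.
    public double predictLoad() {
        double smoothed = recentLoad.isEmpty() ? 0.0 : recentLoad.peekFirst();
        for (double sample : recentLoad) {
            smoothed = ALPHA * sample + (1 - ALPHA) * smoothed;
        }
        return smoothed;
    }

    // Stages 2-3: judge bottleneck/redundancy against predicted load and
    // return the target node count of the rescheduling plan.
    public int reschedulePlan(int currentNodes) {
        double predicted = predictLoad();
        double utilization = predicted / (currentNodes * NODE_CAPACITY);
        if (utilization > HIGH_WATERMARK) {
            // Predicted bottleneck: scale out before the load peak arrives.
            return (int) Math.ceil(predicted / (HIGH_WATERMARK * NODE_CAPACITY));
        }
        if (utilization < LOW_WATERMARK && currentNodes > 1) {
            // Predicted redundancy: scale in; moving operator state between nodes
            // would be handled by an online load-migration step (not sketched here).
            return Math.max(1, (int) Math.ceil(predicted / (HIGH_WATERMARK * NODE_CAPACITY)));
        }
        return currentNodes; // keep the current scale
    }
}
```

In this sketch the rescheduling plan is reduced to a target node count; the paper's strategy additionally plans which load to migrate between nodes so that the migration itself stays cheap.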
Pages: 92-108
Number of pages: 16