Research Progress on Key Technologies Towards Real-time Stream Processing Applications

Authors
Xu Z.-Z. [1, 2]
Xu C. [1, 2, 3]
Ding G.-Y. [1, 2]
Chen Z.-H. [1, 2]
Zhou A.-Y. [1, 2]
Affiliations
[1] School of Data Science and Engineering, East China Normal University, Shanghai
[2] Shanghai Engineering Research Center of Big Data Management, Shanghai
[3] Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin
Source
Ruan Jian Xue Bao/Journal of Software | 2024, Vol. 35, No. 1
Keywords
data processing system; real-time processing; stream processing
DOI
10.13328/j.cnki.jos.006917
Abstract
To support knowledge mining and management, information systems must process many forms of data, including stream data. Stream data are large in scale and generated at high speed, and the knowledge they contain is highly time-sensitive. Developing stream processing technology that supports real-time stream processing applications is therefore essential to knowledge management in information systems. Stream processing systems (SPSs) can be traced back to the 1990s and have developed significantly since then. However, today's diverse knowledge management needs and the new generation of hardware architectures bring new challenges and opportunities for SPSs, and have prompted a series of research efforts on stream processing. This study introduces the basic requirements and development history of SPSs and then analyzes relevant technologies in the SPS field from four aspects: programming interface, execution plan, resource scheduling, and fault tolerance. Finally, it discusses likely future research directions and development trends of stream processing technology. © 2024 Chinese Academy of Sciences. All rights reserved.
Pages: 430-454
Page count: 24