Research on Real-time Processing and Stream Analysis of Unstructured Data Based on Big Data Platforms

Cited by: 0
Authors
Liang, Huichao [1]
Wang, Di [1]
Liu, Yuan [1]
Mei, Lin [1]
Zhou, Mengxue [1]
Zhao, Haibin [1]
Affiliations
[1] State Grid Henan, Informat & Telecommun Co Data Ctr, Zhengzhou 450000, Henan, Peoples R China
Keywords
Big Data Platform; Unstructured Data; Real-time Processing; Stream Data
DOI
10.1145/3662739.3665984
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
With the rise of the big data era, massive data streams are emerging in fields such as the internet, the Internet of Things, and finance, posing significant challenges for real-time processing. These streams are high-velocity, random, and unordered, so the systems that process them must remain timely, stable, and correct. This paper therefore focuses on stream processing of massive unstructured data on big data platforms and constructs a stream data capture and analysis architecture based on distributed processing mechanisms. First, the paper designs a distributed stream data processing system based on network data capture, targeting high-speed data processing requirements. Combining Pcap network data capture with a distributed computing system achieves real-time, high-speed data parsing. The paper also compares two common distributed frameworks and establishes frameworks that support high-speed data storage and stream computing, avoiding both the insufficient computing capacity of a single machine and the high cost of large-scale computers. Second, because data from different business systems follow different protocols, the paper proposes a generic protocol parsing method: template writing rules are defined and corresponding protocol parsing template files are constructed, which enables parsing of generic protocols. Experiments verify the feasibility of the proposed methods, providing a reference for real-time processing and stream analysis of unstructured data on big data platforms.
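The abstract does not give the template writing rules themselves, so the following is only a minimal sketch of what template-driven protocol parsing could look like. The template layout, the field names, and the helper functions `compile_template` and `parse_packet` are all hypothetical and are not taken from the paper. The idea illustrated is that the parser code stays fixed while each protocol is described by a separate template.

```python
import struct

# Hypothetical template (NOT the paper's format): each field is a
# (name, struct format code) pair, decoded in network byte order.
TEMPLATE = {
    "name": "demo_protocol",
    "fields": [
        ("version", "B"),   # 1-byte unsigned integer
        ("msg_type", "B"),  # 1-byte unsigned integer
        ("length", "H"),    # 2-byte unsigned integer
        ("sequence", "I"),  # 4-byte unsigned integer
    ],
}


def compile_template(template):
    """Turn a template into a struct layout plus the ordered field names."""
    fmt = "!" + "".join(code for _, code in template["fields"])
    names = [name for name, _ in template["fields"]]
    return struct.Struct(fmt), names


def parse_packet(template, payload):
    """Decode the fixed header described by the template; keep the rest as body."""
    layout, names = compile_template(template)
    if len(payload) < layout.size:
        raise ValueError("payload is shorter than the templated header")
    record = dict(zip(names, layout.unpack_from(payload)))
    record["body"] = payload[layout.size:]
    return record


# Build a sample payload that matches the hypothetical template.
sample = struct.pack("!BBHI", 1, 2, 5, 42) + b"hello"
record = parse_packet(TEMPLATE, sample)
print(record)
```

Under this assumption, supporting another business system's protocol would mean writing a new template rather than new parsing code. In a real deployment the payload would come from a packet capture source rather than from `struct.pack`.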
Pages: 96-101
Number of pages: 6