Research on Real-time Processing and Stream Analysis of Unstructured Data Based on Big Data Platforms

Cited by: 0
Authors
Liang, Huichao [1]
Wang, Di [1]
Liu, Yuan [1]
Mei, Lin [1]
Zhou, Mengxue [1]
Zhao, Haibin [1]
Affiliation
[1] State Grid Henan Information & Telecommunication Company, Data Center, Zhengzhou 450000, Henan, People's Republic of China
Keywords
Big Data Platform; Unstructured Data; Real-time Processing; Stream Data
DOI
10.1145/3662739.3665984
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With the rise of the big data era, massive data streams are emerging in fields such as the Internet, the Internet of Things, and finance, posing significant challenges for real-time processing. These data are high-velocity, random, and unordered, so the systems that process them must remain timely, stable, and correct. This paper therefore focuses on stream processing of massive unstructured data on big data platforms and constructs a stream data capture and analysis architecture based on distributed processing mechanisms. First, it designs a distributed stream data processing system based on network data capture, targeting high-speed data processing requirements. Combining Pcap network data capture with a distributed computing system achieves real-time parsing of high-speed data. The paper also compares two common distributed frameworks and establishes frameworks that support high-speed data storage and stream computing, avoiding both the insufficient computing capacity of a single machine and the high cost of large-scale computers. Second, because data from different business systems carry different protocol contents, the paper proposes a generic protocol parsing method: template writing rules are defined and the corresponding protocol parsing template files are constructed, so that generic protocols can be parsed. Experiments verify the feasibility of the proposed methods, providing a reference for real-time processing and stream analysis of unstructured data on big data platforms.
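The abstract does not give the paper's template writing rules, so the following is only a minimal sketch of what template-driven protocol parsing could look like. The template layout, the field names (`msg_type`, `device_id`, `value`), and the function `parse_with_template` are all hypothetical and chosen for illustration; only Python's standard `struct` module is used.

```python
import struct

# Hypothetical template: an ordered list of (field name, struct format code).
# The paper's actual template rules are not stated in the abstract.
DEMO_TEMPLATE = {
    "byte_order": ">",  # big-endian (network byte order), no padding
    "fields": [
        ("msg_type", "B"),   # 1-byte unsigned integer
        ("device_id", "H"),  # 2-byte unsigned integer
        ("value", "I"),      # 4-byte unsigned integer
    ],
}


def parse_with_template(template, payload):
    """Decode the fixed-size header of `payload` as described by `template`."""
    fmt = template["byte_order"] + "".join(code for _, code in template["fields"])
    size = struct.calcsize(fmt)
    if len(payload) < size:
        raise ValueError(f"payload too short: need {size} bytes, got {len(payload)}")
    values = struct.unpack(fmt, payload[:size])
    names = [name for name, _ in template["fields"]]
    return dict(zip(names, values))


if __name__ == "__main__":
    # 0x01 | 0x002A | 0x000001F4
    packet = bytes([0x01, 0x00, 0x2A, 0x00, 0x00, 0x01, 0xF4])
    print(parse_with_template(DEMO_TEMPLATE, packet))
```

In this sketch, supporting another protocol means supplying another template rather than writing another parser, which is the general idea the abstract attributes to template files.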
Pages: 96-101
Page count: 6