A Comprehensive Analysis of Classification Methods for Big Data Stream

Cited by: 0
Authors
Kaur, Amrinder [1 ]
Kumar, Rakesh [2 ]
Affiliations
[1] Maharshi Dayanand Univ, Dept Comp Sci & Applicat, Rohtak, Haryana, India
[2] Kurukshetra Univ, Dept Comp Sci & Applicat, Thanesar, India
Keywords
Big data; Classification; Naive Bayes; Hoeffding tree; BayesNet; Decision stump;
DOI
10.1007/978-981-15-0222-4_18
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Traditional tools for mining big data have become insufficient because of the ever-growing volume of data in the world. To handle big data, real-time and distributed processing is adopted. With so many mining tools available, it can be difficult for a researcher to choose an efficient one. This paper is intended to aid the researcher who understands WEKA but is inexperienced with big data. The preliminary stage of data mining is classification, which categorizes the data into predefined groups. In this paper, WEKA with the MOA package is used to classify a big data stream with four different classifiers. The performance of these classifiers is analyzed in terms of accuracy (i.e., correctly and incorrectly classified instances), the time taken to build the model, and the time taken to test the model. For this particular scenario, the results obtained indicate that naive Bayes is the most accurate classifier and decision stump is the least effective classifier for big data.
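The three measures named in the abstract (correctly and incorrectly classified instances, time to build the model, time to test the model) can be illustrated with a minimal Python sketch. This is not the authors' WEKA/MOA setup: the `StreamingNaiveBayes` class, the `evaluate` helper, and the toy data are hypothetical stand-ins written only to show how such measurements might be collected for a classifier that learns one instance at a time.

```python
import math
import time
from collections import defaultdict


class StreamingNaiveBayes:
    """Hypothetical incremental naive Bayes for categorical features."""

    def __init__(self):
        self.class_counts = defaultdict(int)
        # (feature index, feature value, label) -> occurrences seen so far
        self.feature_counts = defaultdict(int)
        # feature index -> set of distinct values seen so far
        self.values_seen = defaultdict(set)

    def learn_one(self, x, y):
        """Update the counts with a single labeled instance from the stream."""
        self.class_counts[y] += 1
        for i, v in enumerate(x):
            self.feature_counts[(i, v, y)] += 1
            self.values_seen[i].add(v)

    def predict_one(self, x):
        """Return the label with the highest log-posterior (Laplace-smoothed)."""
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            score = math.log(n / total)
            for i, v in enumerate(x):
                k = len(self.values_seen.get(i, ())) or 1
                count = self.feature_counts.get((i, v, label), 0)
                score += math.log((count + 1) / (n + k))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def evaluate(model, train_stream, test_stream):
    """Measure build time, test time, and correct/incorrect counts."""
    start = time.perf_counter()
    for x, y in train_stream:
        model.learn_one(x, y)
    build_time = time.perf_counter() - start

    start = time.perf_counter()
    correct = incorrect = 0
    for x, y in test_stream:
        if model.predict_one(x) == y:
            correct += 1
        else:
            incorrect += 1
    test_time = time.perf_counter() - start

    total = correct + incorrect
    return {
        "correct": correct,
        "incorrect": incorrect,
        "accuracy": correct / total if total else 0.0,
        "build_time_s": build_time,
        "test_time_s": test_time,
    }


# Toy data, invented for illustration only (not from the paper).
train = [
    (("sunny", "hot"), "no"),
    (("rainy", "mild"), "yes"),
    (("sunny", "mild"), "no"),
    (("rainy", "hot"), "yes"),
]
test = [(("sunny", "hot"), "no"), (("rainy", "mild"), "yes")]

result = evaluate(StreamingNaiveBayes(), train, test)
print(result)
```

Any other classifier exposing the same two hypothetical methods (`learn_one`, `predict_one`) could be passed to `evaluate` to compare models on the same measures.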
Pages: 213-222
Page count: 10