Fuzzy Neighbors and Deep Learning-Assisted Spark Model for Imbalanced Classification of Big Data

Cited by: 0
Authors
Nalinipriya, G. [1]
Geetha, M. [2]
Sudha, D. [3]
Daniya, T. [4]
Affiliations
[1] Saveetha Engn Coll, Dept Informat Technol, Chennai 602105, Tamil Nadu, India
[2] Chennai Inst Technol, Dept Comp Sci & Engn, Chennai, Tamil Nadu, India
[3] Meenakshi Coll Engn, Dept Comp Sci & Engn, Chennai 600078, Tamil Nadu, India
[4] GMR Inst Technol, Dept Informat Technol, Rajam 532127, Andhra Pradesh, India
Keywords
Big data classification; Spark architecture; bird swarm optimization; deep belief network; deer hunting optimization; MapReduce; framework
DOI
10.1142/S0218488523500095
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Big data plays an important role in knowledge manipulation, assessment, and prediction. However, extracting and analyzing knowledge from large databases is difficult because imbalanced data distributions lead to wrong decisions and biased classification outputs. This work therefore designs a big data classification approach using the proposed Bird Swarm Deer Hunting Optimization-based Deep Belief Network (BSDHO-based DBN), built on a Spark architecture that comprises a master node and a slave node. The proposed BSDHO is obtained by combining the Deer Hunting Optimization algorithm with the Bird Swarm Algorithm. The training data is first given to the master node of the Spark architecture for data transformation, which is performed with an exponential log kernel; sequential forward selection then chooses suitable features for further processing. Next, oversampling is performed with Fuzzy K-Nearest Neighbor (Fuzzy KNN) in the slave node on the selected features to handle the data imbalance. Classification is then carried out in the master node with a Deep Belief Network trained using the developed BSDHO algorithm. The test data, in turn, is fed to the slave node for data transformation, and the transformed data is passed to the master node for classification based on the proposed BSDHO. Finally, the outputs for the training and testing data yield the classified output. The proposed BSDHO-based DBN achieved a highest specificity of 97.92%, accuracy of 96.92%, and sensitivity of 96.9%.
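The feature-selection step named in the abstract, sequential forward selection, can be illustrated generically. The sketch below is not the authors' implementation: the abstract does not state the scoring criterion or stopping rule, so the `score` callback, the `max_features` cap, and the toy gain values are all assumptions made for illustration. It shows only the general greedy idea of starting from an empty set and adding, one at a time, the feature that improves the score most.

```python
from typing import Callable, List, Sequence


def sequential_forward_selection(
    n_features: int,
    score: Callable[[Sequence[int]], float],
    max_features: int,
) -> List[int]:
    """Greedy forward selection over feature indices 0..n_features-1.

    `score` is a hypothetical callback that rates a candidate subset
    (higher is better); the paper's actual criterion is not given.
    """
    selected: List[int] = []
    best_score = float("-inf")
    while len(selected) < max_features:
        candidates = [f for f in range(n_features) if f not in selected]
        if not candidates:
            break
        # Evaluate every subset formed by adding one more feature.
        scored = [(score(selected + [f]), f) for f in candidates]
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        if top_score <= best_score:
            break  # no remaining feature improves the score
        selected.append(top_feature)
        best_score = top_score
    return selected


def toy_score(subset: Sequence[int]) -> float:
    # Assumed per-feature gains and a size penalty, purely illustrative.
    gains = [0.5, 0.1, 0.3, 0.0]
    return sum(gains[i] for i in subset) - 0.15 * len(subset)


chosen = sequential_forward_selection(4, toy_score, max_features=3)
print(chosen)
```

With these toy gains the search keeps features 0 and 2 and stops, because adding either remaining feature costs more in penalty than it gains. In practice the score would typically be a cross-validated classifier metric rather than a fixed table.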
Pages: 141-162
Page count: 22