Feature selection from disaster tweets using Spark-based parallel meta-heuristic optimizers

Authors
Mohammed Ahsan Raza Noori
Bharti Sharma
Ritika Mehra
Affiliations
[1] DIT University, School of Computing
[2] Dev Bhoomi Uttarakhand University, School of Engineering & Computing
Keywords
Feature selection; Parallel meta-heuristic optimization; Apache Spark; Disaster; Twitter; Classification
Abstract
Twitter is considered a useful tool for tracking and managing disaster-related incidents. However, textual data contains a large number of irrelevant features, and the resulting high dimensionality increases computational cost and degrades classification performance. To address this problem, this work presents Spark-BGWO and Spark-BWOA, Apache Spark-based parallel implementations of two nature-inspired meta-heuristic optimizers, binary gray wolf optimization (BGWO) and the binary whale optimization algorithm (BWOA), for optimal feature selection and classification of disaster tweets. A random forest (RF) classifier is applied during the wrapper-based feature subset selection and classification process. The performance of the proposed optimizers was analyzed on seven benchmark disaster tweet datasets, namely California Wildfires, Hurricane Harvey, Hurricane Irma, Hurricane Maria, Iraq–Iran Earthquake, Mexico Earthquake, and Sri Lanka Floods, and the results were compared with the most recent work on the same datasets. Both optimizers performed competently in feature selection and classification, and outperformed the previous work on five of the seven datasets in terms of accuracy and F1-score.
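
The wrapper-based search described in the abstract can be illustrated with a minimal sketch of a binary grey wolf optimizer over feature masks. This is not the authors' implementation: the abstract does not specify the binarization scheme, so a sigmoid transfer function (one common choice) is assumed here, and `toy_fitness` is a hypothetical stand-in for the random forest validation score that a real wrapper would compute. The names `bgwo`, `toy_fitness`, and all parameter values are illustrative assumptions.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def toy_fitness(mask, relevant, penalty=0.01):
    """Hypothetical stand-in for a wrapper score (e.g. RF accuracy on the
    selected features), with a small penalty per selected feature."""
    if not any(mask):
        return 0.0
    hits = sum(1 for i in relevant if mask[i])
    return hits / len(relevant) - penalty * sum(mask)


def bgwo(fitness, n_features, n_wolves=8, n_iter=30, seed=0):
    """Binary grey wolf search over 0/1 feature masks (assumed sigmoid variant)."""
    rng = random.Random(seed)
    wolves = [[rng.randint(0, 1) for _ in range(n_features)]
              for _ in range(n_wolves)]
    best_mask, best_score = None, float("-inf")
    for t in range(n_iter):
        # Fitness evaluations are independent per wolf, so in a Spark-style
        # parallel version this is the natural step to distribute.
        scores = [fitness(w) for w in wolves]
        ranked = sorted(zip(scores, wolves), key=lambda p: p[0], reverse=True)
        if ranked[0][0] > best_score:
            best_score, best_mask = ranked[0][0], list(ranked[0][1])
        leaders = [list(w) for _, w in ranked[:3]]  # alpha, beta, delta
        a = 2.0 * (1 - t / n_iter)  # decreases linearly from 2 toward 0
        new_wolves = []
        for w in wolves:
            new_w = []
            for d in range(n_features):
                x = 0.0
                for leader in leaders:
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    D = abs(C * leader[d] - w[d])
                    x += leader[d] - A * D
                x /= 3.0  # average pull of the three leaders
                prob = sigmoid(10 * (x - 0.5))  # map position to bit probability
                new_w.append(1 if rng.random() < prob else 0)
            new_wolves.append(new_w)
        wolves = new_wolves
    return best_mask, best_score


relevant = {0, 3, 7}  # toy ground truth: indices of the useful features
mask, score = bgwo(lambda m: toy_fitness(m, relevant), n_features=12)
print(mask, round(score, 3))
```

In a real wrapper, the fitness function would train and validate a classifier on the columns selected by `mask`; that evaluation dominates the cost, which is why distributing it across workers is attractive.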