A comparative evaluation of machine learning and deep learning algorithms for question categorization of VQA datasets

Cited by: 1
Authors
Asudani, Deepak Suresh [1]
Nagwani, Naresh Kumar [1]
Singh, Pradeep [1]
Affiliation
[1] Natl Inst Technol, Dept Comp Sci & Engn, Raipur, Chhattisgarh, India
Keywords
Question Classification; Machine Learning; Deep Learning; SMOTE; BERT-based Transformers
DOI
10.1007/s11042-023-17797-2
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Question classification primarily involves categorizing questions by the type of answer expected, with less emphasis on the words or phrases used to form the query. It is crucial in Visual Question Answering (VQA) systems, and dataset quality plays an essential role in their development. The question categories available in the VQA and TDIUC datasets are imbalanced, and a VQA model trained on imbalanced datasets performs poorly: it handles language-prior problems badly, fails to categorize questions, and predicts incorrect outcomes. A better method for classifying questions into appropriate categories based on phrases is therefore necessary. This paper examines the effectiveness of the synthetic minority oversampling technique (SMOTE) in addressing class imbalance in the question classification task using an LSTM, selected machine learning models, and a BERT-based transformer model. The preprocessing and analysis module efficiently categorizes input question sets by identifying valuable phrases and obtains an evenly distributed dataset based on question categories from both datasets. Among the Naive Bayes, SVM, Random Forest, and XGBoost models evaluated, XGBoost outperforms the other selected classifiers, while the LSTM model achieves higher accuracy but requires more computation time. The empirical assessment indicates that the BERT-based transformer model outperforms the traditional models used for comparison. The ablation study also shows that applying SMOTE to question classification yields slightly improved accuracy at the cost of higher computation time and resources. The paper concludes that the BERT-based transformer model performs question classification efficiently and precisely.
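The abstract names SMOTE as the oversampling step but does not spell out how it works. The following is a minimal sketch of the general SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours), not the authors' implementation. The function names `smote` and `balance`, the parameter defaults, the category labels, and the toy feature vectors are all assumptions made for illustration; in practice the vectors would come from whatever text representation the pipeline uses.

```python
import math
import random
from collections import Counter


def smote(samples, n_new, k=2, seed=0):
    """Create n_new synthetic points from minority-class feature vectors.

    Hypothetical helper: a plain-Python illustration of SMOTE, not a
    library API and not the paper's code.
    """
    if len(samples) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(samples) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(samples))
        base = samples[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [s for j, s in enumerate(samples) if j != i]
        others.sort(key=lambda s: math.dist(base, s))
        neighbour = rng.choice(others[:k])
        # Place the new point somewhere on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, k=2, seed=0):
    """Oversample every class up to the size of the largest class."""
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        if count < target:
            minority = [x for x, lab in zip(X, y) if lab == label]
            new_points = smote(minority, target - count, k=k, seed=seed)
            X_out.extend(new_points)
            y_out.extend([label] * len(new_points))
    return X_out, y_out


# Toy, made-up feature vectors standing in for vectorized questions.
X = [[0.0, 1.0], [0.1, 0.9], [0.2, 1.1], [0.0, 0.8], [1.0, 0.0], [0.9, 0.2]]
y = ["counting", "counting", "counting", "counting", "color", "color"]
X_bal, y_bal = balance(X, y)
print(Counter(y_bal))
```

Because each synthetic point needs a neighbour search, the extra cost grows with the size of the minority classes, which is in line with the abstract's remark that SMOTE trades some computation time for a small accuracy gain.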
Pages: 57829 - 57859
Page count: 31