Machine Learning Algorithms for Document Classification: Comparative Analysis

Authors
Rashid, Faizur [1]
Gargaare, Suleiman M. A. [2]
Aden, Abdulkadir H. [3]
Abdi, Afendi [4]
Affiliations
[1] Haramaya Univ, Dept Comp Sci, Almaya, Ethiopia
[2] Univ Hargeisa, Dept Comp Sci, Hargeisa, Somalia
[3] Bule Hora Univ, Dept Comp Sci, Bule Hora, Ethiopia
[4] Haramaya Univ, Dept Software Engn, Almaya, Ethiopia
Keywords
Document classification; machine learning algorithms; text classification; analysis
DOI
10.14569/IJACSA.2022.0130430
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Automated document classification is a fundamental machine learning task: automatically assigning categories to scanned images of documents. The field has reached a state-of-the-art stage, but the performance and efficiency of the algorithms still need to be verified by comparison. The objective was to identify the most efficient classification algorithms, judged by fundamental scientific criteria. Experimental methods were used, collecting data from a total of 1080 students and researchers from Ethiopian universities, together with a meta-data set of Banknotes, Crowdsourced Mapping, and VxHeaven provided by UC Irvine. 25% of the respondents felt that KNN is better than the other models. KNN, SVM, Perceptron, and Gaussian NB were compared on accuracy, precision, recall, F-score, classification time, and running time, with reported values of 99.85% accuracy, 0.996 precision, 100% recall, and 0.997 F-score. KNN performed better than the other classification algorithms, with a lower error rate of 0.0002 and the least classification time and running time, approximately 413 and 3.6978 microseconds respectively. Considering all the parameters, the study concludes that the KNN classifier is the best algorithm.
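The abstract names accuracy, precision, recall, F-score, and error rate without defining them. A minimal sketch of the standard binary-classification definitions follows; it assumes the usual textbook formulas (the abstract does not say which variants the authors used), and the names `BinaryMetrics` and `binary_metrics` and the toy labels are hypothetical, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f_score: float
    error_rate: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute standard metrics for one positive class (assumed definitions)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    # Count the four confusion-matrix cells.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0

    # Error rate is taken here as the complement of accuracy (an assumption).
    return BinaryMetrics(accuracy, precision, recall, f_score, 1.0 - accuracy)


# Toy labels for illustration only; not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

The same function could be applied to the predictions of each classifier in turn (KNN, SVM, Perceptron, Gaussian NB) to produce a like-for-like comparison table.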
Pages: 260-265
Page count: 6