The Effect of Preprocessing Techniques, Applied to Numeric Features, on Classification Algorithms' Performance

Citations: 37
Authors
Alshdaifat, Esra'a [1]
Alshdaifat, Doa'a [1]
Alsarhan, Ayoub [1]
Hussein, Fairouz [1]
El-Salhi, Subhieh Moh'd Faraj S. [1]
Affiliations
[1] Hashemite Univ, Fac Prince Al Hussein Bin Abdallah II Informat Te, Dept Comp Informat Syst, POB 330127, Zarqa 13133, Jordan
Keywords
preprocessing; classification algorithms; normalization; missing values; classification performance; data cleaning
DOI
10.3390/data6020011
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The performance of any prediction model depends on several factors, and one of the most significant is the choice of preprocessing techniques. Preprocessing is therefore essential for building an effective and efficient classification model. This paper investigates the impact of the most widely used preprocessing techniques for numerical features on the performance of classification algorithms. The effect of combining various normalization techniques with strategies for handling missing values is assessed on eighteen benchmark datasets using two well-known classification algorithms, several performance evaluation metrics, and statistical significance tests. According to the reported experimental results, the impact of the adopted preprocessing techniques varies from one classification algorithm to another. In addition, a statistically significant difference between the considered data preprocessing techniques is demonstrated.
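As an illustration of the kind of pipeline the abstract describes, the sketch below combines one missing-value strategy with two normalization techniques on a single numeric feature. The abstract does not name the specific techniques the authors evaluated, so mean imputation, min-max scaling, and z-score standardization are assumptions chosen as common examples, and the function names are hypothetical.

```python
from statistics import mean, pstdev

# Assumed techniques for illustration only; the abstract does not list
# which imputation or normalization methods the paper actually compares.

def impute_mean(column):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]

def min_max(column):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant feature carries no spread; map it to zeros.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def z_score(column):
    """Standardize values to zero mean and unit (population) variance."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]
    return [(v - mu) / sigma for v in column]

raw = [2.0, None, 4.0, 6.0]       # one numeric feature with a missing value
filled = impute_mean(raw)          # missing value handled first
scaled = min_max(filled)           # then normalized to [0, 1]
standardized = z_score(filled)     # alternative normalization
```

In a study like the one summarized here, each such combination would be applied before training a classifier, and the resulting scores compared across datasets.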
Pages: 1-23
Number of pages: 23