Comparing linear discriminant analysis and supervised learning algorithms for binary classification-A method comparison study

Cited by: 19
Authors
Graf, Ricarda [1]
Zeldovich, Marina [2]
Friedrich, Sarah [1, 3]
Affiliations
[1] Univ Augsburg, Dept Math, Univ Str 14, D-86159 Augsburg, Germany
[2] Univ Med Ctr Gottingen, Inst Med Psychol & Med Sociol, Gottingen, Germany
[3] Univ Augsburg, Ctr Adv Analyt & Predict Sci CAAPS, Augsburg, Germany
Keywords
binary classification; linear discriminant analysis; multivariate normality; simulation study; supervised learning; REGULARIZATION PATH; PREDICTION; MODELS; VALIDATION; SEARCH;
DOI
10.1002/bimj.202200098
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
In psychology, linear discriminant analysis (LDA) is the method of choice for two-group classification tasks based on questionnaire data. In this study, we present a comparison of LDA with several supervised learning algorithms. In particular, we examine to what extent the predictive performance of LDA relies on the multivariate normality assumption. As nonparametric alternatives, the linear support vector machine (SVM), classification and regression tree (CART), random forest (RF), probabilistic neural network (PNN), and the ensemble k conditional nearest neighbor (EkCNN) algorithms are applied. Predictive performance is determined using measures of overall performance, discrimination, and calibration, and is compared in two reference data sets as well as in a simulation study. The reference data are Likert-type data, and comprise 5 and 10 predictor variables, respectively. Simulations are based on the reference data and are done for a balanced and an unbalanced scenario in each case. In order to compare the algorithms' performance, data are simulated from multivariate distributions with differing degrees of nonnormality. Results differ depending on the specific performance measure. The main finding is that LDA is always outperformed by RF in the bimodal data with respect to overall performance. The discriminative ability of the RF algorithm is often higher than that of LDA, but its model calibration is usually worse. Still, LDA mostly ranks second in the cases where it is outperformed by another algorithm, or the differences are only marginal. Consequently, we still recommend LDA for this type of application.
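The abstract names three families of performance measures (overall performance, discrimination, and calibration) without listing the specific statistics. As a minimal sketch, assuming the Brier score for overall performance, the area under the ROC curve (AUC) for discrimination, and calibration-in-the-large for calibration, the three can be computed from predicted probabilities as follows. The choice of statistics, the function names, and the toy data are illustrative assumptions and are not taken from the study.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], p_hat: Sequence[float]) -> float:
    """Overall performance (assumed measure): mean squared difference
    between the predicted probability and the observed 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_hat)) / len(y_true)


def auc(y_true: Sequence[int], p_hat: Sequence[float]) -> float:
    """Discrimination (assumed measure): probability that a randomly chosen
    positive case receives a higher score than a randomly chosen negative
    case, with ties counted as one half."""
    pos = [p for y, p in zip(y_true, p_hat) if y == 1]
    neg = [p for y, p in zip(y_true, p_hat) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def calibration_in_the_large(y_true: Sequence[int],
                             p_hat: Sequence[float]) -> float:
    """Calibration (assumed measure): mean predicted probability minus the
    observed event rate. Zero means the model is calibrated on average."""
    n = len(y_true)
    return sum(p_hat) / n - sum(y_true) / n


# Made-up labels and predicted probabilities, for illustration only.
y = [0, 0, 1, 1, 0, 1]
p = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]

print("Brier score:", brier_score(y, p))
print("AUC:", auc(y, p))
print("Calibration-in-the-large:", calibration_in_the_large(y, p))
```

Scoring each classifier's predicted probabilities with the same three functions shows how one algorithm can lead on discrimination while trailing on calibration, which is the pattern the abstract reports for RF relative to LDA.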
Pages: 20
Related papers
50 in total
  • [31] A Comparative Performance Analysis on Network Traffic classification using Supervised learning algorithms
    Archanaa, R.
    Athulya, V.
    Rajasundari, T.
    Kiran, Vamsee Krishna M.
    2017 4TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING AND COMMUNICATION SYSTEMS (ICACCS), 2017,
  • [32] A METHOD FOR SELECTING BETWEEN LINEAR AND QUADRATIC CLASSIFICATION MODELS IN DISCRIMINANT-ANALYSIS
    MESHBANE, A
    MORRIS, JD
    JOURNAL OF EXPERIMENTAL EDUCATION, 1995, 63 (03): : 263 - 273
  • [33] Assessment of computerized algorithms by comparing with human observers in binary classification tasks: a simulation study
    Yang, Yang
    Sahiner, Berkman
    Huang, Zhipeng
    Petrick, Nicholas
    Chen, Weijie
    MEDICAL IMAGING 2018: IMAGE PERCEPTION, OBSERVER PERFORMANCE, AND TECHNOLOGY ASSESSMENT, 2018, 10577
  • [34] Classification of goat genetic resources using morphological traits. Comparison of machine learning techniques with linear discriminant analysis
    Rodero, E.
    Gonzalez, A.
    Dorado-Moreno, M.
    Luque, M.
    Hervas, C.
    LIVESTOCK SCIENCE, 2015, 180 : 14 - 21
  • [35] Comparison of Linear Discriminant Analysis and Support Vector Machine in Classification of Subdural and Extradural Hemorrhages
    Tong, Hau-Lee
    Fauzi, Mohammad Faizal Ahmad
    Haw, Su-Cheng
    Ng, Hu
    SOFTWARE ENGINEERING AND COMPUTER SYSTEMS, PT 1, 2011, 179 : 723 - +
  • [36] Comparison of linear discriminant analysis methods for the classification of cancer based on gene expression data
    Huang, Desheng
    Quan, Yu
    He, Miao
    Zhou, Baosen
    JOURNAL OF EXPERIMENTAL & CLINICAL CANCER RESEARCH, 2009, 28
  • [37] Comparison of linear discriminant analysis methods for the classification of cancer based on gene expression data
    Desheng Huang
    Yu Quan
    Miao He
    Baosen Zhou
    Journal of Experimental & Clinical Cancer Research, 28
  • [38] Comparison between Linear Discriminant Analysis and Singular Value Decomposition for PD Gait Classification
    Ilias, Suryani
    Jailani, Rozita
    Tahir, Nooritawati Md
    ISCAIE 2015 - 2015 IEEE SYMPOSIUM ON COMPUTER APPLICATIONS AND INDUSTRIAL ELECTRONICS, 2015, : 142 - 146
  • [39] Performance Comparison of Supervised Machine Learning Algorithms for Multiclass Transient Classification in a Nuclear Power Plant
    Prusty, Manas Ranjan
    Chakraborty, Jaideep
    Jayanthi, T.
    Velusamy, K.
    SWARM, EVOLUTIONARY, AND MEMETIC COMPUTING, SEMCCO 2014, 2015, 8947 : 111 - 122
  • [40] Asymptotically Bias-Corrected Regularized Linear Discriminant Analysis for Cost-Sensitive Binary Classification
    Zollanvari, Amin
    Abdirash, Muratkhan
    Dadlani, Aresh
    Abibullaev, Berdakh
    IEEE SIGNAL PROCESSING LETTERS, 2019, 26 (09) : 1300 - 1304