Comparing linear discriminant analysis and supervised learning algorithms for binary classification: A method comparison study

Cited by: 19
Authors
Graf, Ricarda [1]
Zeldovich, Marina [2]
Friedrich, Sarah [1, 3]
Affiliations
[1] Univ Augsburg, Dept Math, Univ Str 14, D-86159 Augsburg, Germany
[2] Univ Med Ctr Gottingen, Inst Med Psychol & Med Sociol, Gottingen, Germany
[3] Univ Augsburg, Ctr Adv Analyt & Predict Sci CAAPS, Augsburg, Germany
Keywords
binary classification; linear discriminant analysis; multivariate normality; simulation study; supervised learning; REGULARIZATION PATH; PREDICTION; MODELS; VALIDATION; SEARCH;
DOI
10.1002/bimj.202200098
Chinese Library Classification (CLC) number
Q [Biological Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
In psychology, linear discriminant analysis (LDA) is the method of choice for two-group classification tasks based on questionnaire data. In this study, we present a comparison of LDA with several supervised learning algorithms. In particular, we examine to what extent the predictive performance of LDA relies on the multivariate normality assumption. As nonparametric alternatives, the linear support vector machine (SVM), classification and regression tree (CART), random forest (RF), probabilistic neural network (PNN), and the ensemble k conditional nearest neighbor (EkCNN) algorithms are applied. Predictive performance is determined using measures of overall performance, discrimination, and calibration, and is compared in two reference data sets as well as in a simulation study. The reference data are Likert-type data, and comprise 5 and 10 predictor variables, respectively. Simulations are based on the reference data and are done for a balanced and an unbalanced scenario in each case. In order to compare the algorithms' performance, data are simulated from multivariate distributions with differing degrees of nonnormality. Results differ depending on the specific performance measure. The main finding is that LDA is always outperformed by RF in the bimodal data with respect to overall performance. Discriminative ability of the RF algorithm is often higher compared to LDA, but its model calibration is usually worse. Still, LDA mostly ranks second in cases where it is outperformed by another algorithm, or the differences are only marginal. Consequently, we still recommend LDA for this type of application.
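The abstract names three families of performance measures (overall performance, discrimination, calibration) but does not say which statistics were used. As a minimal sketch, assuming the Brier score, the area under the ROC curve (AUC), and calibration-in-the-large as common representatives of those families, the distinction can be illustrated in plain Python. The function names and the toy labels and probabilities below are hypothetical and are not taken from the study.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], p_hat: Sequence[float]) -> float:
    """Overall performance: mean squared gap between predicted
    probability and the 0/1 outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_hat)) / len(y_true)


def auc(y_true: Sequence[int], p_hat: Sequence[float]) -> float:
    """Discrimination: share of (positive, negative) pairs in which the
    positive case receives the higher predicted probability; ties count 0.5."""
    pos = [p for y, p in zip(y_true, p_hat) if y == 1]
    neg = [p for y, p in zip(y_true, p_hat) if y == 0]
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0
        for pp in pos
        for pn in neg
    )
    return wins / (len(pos) * len(neg))


def calibration_in_the_large(y_true: Sequence[int], p_hat: Sequence[float]) -> float:
    """Calibration: mean predicted probability minus observed event rate
    (0 means predictions are correct on average)."""
    n = len(y_true)
    return sum(p_hat) / n - sum(y_true) / n


# Hypothetical toy data: true classes and predicted probabilities
# from two imaginary classifiers A and B.
y = [0, 0, 1, 1]
p_a = [0.1, 0.4, 0.35, 0.8]
p_b = [0.3, 0.3, 0.7, 0.7]

for name, p in (("A", p_a), ("B", p_b)):
    print(name, brier_score(y, p), auc(y, p), calibration_in_the_large(y, p))
```

Because the three measures capture different properties, a classifier can rank first on one and lower on another, which is consistent with the abstract's remark that results differ depending on the specific performance measure.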
Pages: 20
Related papers
50 in total
  • [21] Supervised Learning Approach towards Class Separability- Linear Discriminant Analysis
    Pathak, Anjali
    Vohra, Bhawna
    Gupta, Kapil
    PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICCS), 2019, : 1088 - 1093
  • [22] Weighted linear programming discriminant analysis for high-dimensional binary classification
    Wu, Yufei
    Yu, Guan
    STATISTICAL ANALYSIS AND DATA MINING, 2020, 13 (05) : 437 - 450
  • [23] Empirical comparison of the classification performance of robust linear and quadratic discriminant analysis
    Joossens, K
    Croux, C
    THEORY AND APPLICATION OF RECENT ROBUST METHODS, 2004, : 131 - 140
  • [24] An Experimental Comparison of Semi-supervised Learning Algorithms for Multispectral Image Classification
    Tu, Enmei
    Yang, Jie
    Fang, Jiangxiong
    Jia, Zhenghong
    Kasabov, Nikola
    PHOTOGRAMMETRIC ENGINEERING AND REMOTE SENSING, 2013, 79 (04): : 347 - 357
  • [25] Comparing the Linear and Quadratic Discriminant Analysis of Diabetes Disease Classification Based on Data Multicollinearity
    Araveeporn, Autcha
    INTERNATIONAL JOURNAL OF MATHEMATICS AND MATHEMATICAL SCIENCES, 2022, 2022
  • [26] Combined 5 x 2 cv F test for comparing supervised classification learning algorithms
    Alpaydin, E
    NEURAL COMPUTATION, 1999, 11 (08) : 1885 - 1892
  • [27] A fuzzy supervised learning method with dynamical parameter estimation for nonlinear discriminant analysis
    Song, Xiaoning
    Liu, Zi
    Yang, Xibei
    Yang, Jingyu
    COMPUTERS & MATHEMATICS WITH APPLICATIONS, 2013, 66 (10) : 1782 - 1794
  • [28] Domain decomposed classification algorithms based on linear discriminant analysis: An optimality theory and applications
    Li, Jingwei
    Cai, Xiao-Chuan
    NEUROCOMPUTING, 2024, 575
  • [29] Poverty classification based on unsatisfied basic needs index: a comparison of supervised learning algorithms
    Ansari, Salmaan
    Dhar, Murali
    SN Social Sciences, 2 (5):
  • [30] Comparison of Supervised Learning Image Classification Algorithms for Food and Non-Food Objects
    Yogaswara, Reza Dea
    Wibawa, Adhi Dharma
    2018 INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING, NETWORK AND INTELLIGENT MULTIMEDIA (CENIM), 2018, : 317 - 324