Model selection for linear classifiers using Bayesian error estimation

Cited by: 13

Authors:

Huttunen, Heikki [1]

Tohka, Jussi [2, 3]

Affiliations:

[1] Tampere Univ Technol, Dept Signal Proc, FIN-33101 Tampere, Finland

[2] Univ Carlos III Madrid, Dept Bioengn & Aerosp Engn, E-28903 Getafe, Spain

[3] Inst Invest Sanitaria Gregorio Maranon, Madrid, Spain

Source:

PATTERN RECOGNITION | 2015 / Vol. 48 / Issue 11

Keywords:

Logistic regression; Support vector machine; Regularization; Bayesian error estimator; Linear classifier; CLASSIFICATION; PERFORMANCE;

DOI:

10.1016/j.patcog.2015.05.005

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Regularized linear models are important classification methods for high-dimensional problems, where regularized linear classifiers are often preferred due to their ability to avoid overfitting. The degrees of freedom of the model are determined by a regularization parameter, which is typically selected using counting-based approaches, such as K-fold cross-validation (CV). For large data, this can be very time consuming, and, for small sample sizes, the accuracy of the model selection is limited by the large variance of CV error estimates. In this paper, we study the applicability of a recently proposed Bayesian error estimator for the selection of the best model along the regularization path. We also propose an extension of the estimator that allows model selection in multiclass cases and study its efficiency with L1-regularized logistic regression and L2-regularized linear support vector machine. The model selection by the new Bayesian error estimator is experimentally shown to improve the classification accuracy, especially in small sample-size situations, and is able to avoid the excess variability inherent to traditional cross-validation approaches. Moreover, the method has significantly smaller computational complexity than cross-validation. (C) 2015 Elsevier Ltd. All rights reserved.
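
For orientation, the counting-based baseline that the abstract contrasts against can be sketched as follows. This is only an illustration of K-fold cross-validation over a grid of regularization strengths, not the Bayesian error estimator studied in the paper. The model is a plain L2-regularized logistic regression trained by gradient descent, chosen for brevity; the function names, the toy data, the candidate grid, and the step-size settings are all assumptions made for this sketch.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logreg(X, y, lam, steps=300, lr=0.5):
    """Gradient descent on mean log-loss + (lam/2)*||w||^2 (bias unpenalized)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * (gj / n + lam * wj) for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def error_count(w, b, X, y):
    # Number of misclassified samples (decision threshold at score 0).
    return sum(
        (sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0) != (yi == 1)
        for xi, yi in zip(X, y)
    )

def kfold_cv_error(X, y, lam, k=5, seed=0):
    """Fraction of held-out samples misclassified across k folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    wrong = 0
    for fold in folds:
        held = set(fold)
        Xtr = [X[i] for i in idx if i not in held]
        ytr = [y[i] for i in idx if i not in held]
        w, b = fit_logreg(Xtr, ytr, lam)
        wrong += error_count(w, b, [X[i] for i in fold], [y[i] for i in fold])
    return wrong / len(X)

def select_lambda(X, y, grid, k=5):
    """Return (best_lambda, {lambda: cv_error}) over a candidate grid."""
    errors = {lam: kfold_cv_error(X, y, lam, k) for lam in grid}
    return min(grid, key=lambda lam: errors[lam]), errors

# Toy data (hypothetical): two Gaussian classes in 2-D, 30 samples each.
rng = random.Random(1)
X = [[rng.gauss(c, 1.0), rng.gauss(c, 1.0)] for c in (-1.0, 1.0) for _ in range(30)]
y = [0] * 30 + [1] * 30
best_lam, cv_errors = select_lambda(X, y, grid=[0.001, 0.01, 0.1, 1.0])
```

Each candidate value requires k model fits, which is the computational cost the abstract refers to, and with few samples the per-value error estimates can vary noticeably with the fold assignment.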


Pages: 3739-3748

Number of pages: 10
