Ensemble learning method for classification: Integrating data envelopment analysis with machine learning

Times cited: 0
Authors
An, Qingxian [1, 2]
Huang, Siwei [1]
Han, Yuxuan [1]
Zhu, You [3, 4]
Affiliations
[1] Cent South Univ, Sch Business, Changsha 410083, Peoples R China
[2] Hefei Univ Technol, Sch Econ, Hefei 230601, Peoples R China
[3] Hunan Univ, Business Sch, Changsha 410082, Peoples R China
[4] Hunan Prov Key Lab Philosophy & Social Sci Ind Dig, Changsha 410082, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Ensemble learning; Data envelopment analysis; Classifier; Large dataset; STATISTICAL COMPARISONS; CLASSIFIERS; EFFICIENCY; DEA;
DOI
10.1016/j.cor.2024.106739
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
In classification tasks with large sample sets, the use of a single classifier carries the risk of overfitting. To overcome this issue, an ensemble of classifier models has often been shown to outperform the use of a single "best" model. Given the rich variety of classifier models available, the selection of the high-efficiency classifiers for a given task dataset remains an urgent challenge. However, most of the previous classifier selection methods only focus on the measurement of classification output performance without considering the computational cost. This paper proposes a new ensemble learning method to improve the classification quality for big datasets by using data envelopment analysis. It contains the following two stages: classifier selection and classifier combination. In the first stage, the commonly used classifiers are evaluated on the basis of their performance on resource consumption and classification output performance using the range directional model (RDM); then, the most efficient classifiers are selected. In the second stage, the classifier confusion matrix is evaluated using the data envelopment analysis (DEA) cross-efficiency model. Then, the weight for the classifier combination is determined to ensure that classifiers with higher performance have greater weights based on the cross-efficiency values. Experimental results demonstrate the superiority of the cross-efficiency model over the BCC model and the benchmark voting method in model ensemble. Furthermore, our method has been shown to save more computational resources and yields better results than existing methods.
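The second stage described in the abstract (weights derived from efficiency scores, so that better-performing classifiers count for more) can be illustrated with a minimal sketch. This is not the authors' implementation: the cross-efficiency scores are taken as given rather than computed by a DEA model, normalizing them to sum to 1 is an assumed weighting rule, and the names `normalize_weights`, `weighted_vote`, and all sample values are hypothetical.

```python
from collections import defaultdict


def normalize_weights(scores):
    """Scale non-negative efficiency scores so they sum to 1 (assumed rule)."""
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return [s / total for s in scores]


def weighted_vote(predictions, weights):
    """Combine per-classifier label predictions by weighted majority vote.

    predictions[k][i] is classifier k's label for sample i;
    weights[k] is the weight assigned to classifier k.
    """
    n_samples = len(predictions[0])
    combined = []
    for i in range(n_samples):
        tally = defaultdict(float)
        for preds, w in zip(predictions, weights):
            tally[preds[i]] += w
        # Pick the label with the largest accumulated weight.
        combined.append(max(tally, key=tally.get))
    return combined


# Hypothetical cross-efficiency scores for three selected classifiers.
cross_eff = [0.9, 0.4, 0.3]
weights = normalize_weights(cross_eff)

# Hypothetical label predictions of the three classifiers on four samples.
preds = [
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 1, 1, 0],
]
ensemble = weighted_vote(preds, weights)
print(ensemble)
```

In this toy input the first classifier holds more than half of the total weight, so its label prevails on the first and fourth samples even though the other two classifiers disagree with it; an unweighted majority vote would decide those samples the other way.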
Pages: 17