On the Interpretability of Machine Learning Models and Experimental Feature Selection in Case of Multicollinear Data

Cited by: 18
Authors
Drobnic, Franc [1]
Kos, Andrej [1]
Pustisek, Matevz [1]
Affiliations
[1] Univ Ljubljana, Fac Elect Engn, Trzaska Cesta 25, Ljubljana 1000, Slovenia
Keywords
interpretable machine learning; feature multicollinearity; random forests; feature selection; feature importance; greedy feature selection;
DOI
10.3390/electronics9050761
Chinese Library Classification (CLC) Number
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
In the field of machine learning, a considerable amount of research addresses the interpretability of models and their decisions. Interpretability typically comes at the cost of model quality. Random Forests are among the highest-quality machine learning techniques, but they operate as a "black box". Among the quantifiable approaches to model interpretation are measures of association between predictors and the response; for Random Forests, this usually means calculating the model's feature importances. Known methods, including the built-in one, are less suitable in settings with strong feature multicollinearity. We therefore propose an experimental approach to the feature selection task: a greedy forward feature selection method with a least-trees-used criterion. It yields a set of the most informative features that can be used in a machine learning (ML) training process with prediction quality similar to that of the original feature set. We verify the proposed method on two well-known datasets, one with low feature multicollinearity and one with high feature multicollinearity. The method also allows a domain expert to help select among equally important features, an approach known as human-in-the-loop.
Pages: 15
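
The abstract describes a greedy forward feature selection procedure built around Random Forests. As a rough illustration only, the sketch below implements generic greedy forward selection with scikit-learn; the scoring criterion (cross-validated accuracy), the bundled breast-cancer dataset, and the stopping rule are assumptions for the example and are not the paper's least-trees-used criterion or its experimental data.

# Minimal sketch of greedy forward feature selection with a Random Forest,
# in the spirit of the abstract above. Assumptions: the scoring criterion is
# cross-validated accuracy (NOT the paper's least-trees-used criterion), the
# dataset is scikit-learn's bundled breast-cancer data (not the paper's), and
# selection stops once adding a feature no longer improves the score.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

selected = []                          # indices of features chosen so far
remaining = list(range(X.shape[1]))    # candidate features not yet selected
best_overall = -np.inf

while remaining:
    # Score every single-feature extension of the currently selected subset.
    round_scores = []
    for f in remaining:
        candidate = selected + [f]
        model = RandomForestClassifier(n_estimators=100, random_state=0)
        score = cross_val_score(model, X[:, candidate], y, cv=5).mean()
        round_scores.append((score, f))
    best_score, best_feature = max(round_scores)
    if best_score <= best_overall:     # no further improvement: stop
        break
    best_overall = best_score
    selected.append(best_feature)
    remaining.remove(best_feature)

print("Selected feature indices:", selected)
print("Cross-validated accuracy of the reduced feature set:", round(best_overall, 4))

Each round greedily adds the single feature that most improves the cross-validated score, which is the forward-selection skeleton named in the keywords; swapping in the paper's least-trees-used criterion would presumably replace the scoring step inside the loop.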
Related Papers (50 records in total)
  • [21] Data Classification Using Feature Selection And kNN Machine Learning Approach
    Begum, Shemim
    Chakraborty, Debasis
    Sarkar, Ram
    [J]. 2015 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMMUNICATION NETWORKS (CICN), 2015, : 811 - 814
  • [22] Feature Selection in Pulmonary Function Test Data with Machine Learning Methods
    Karakis, Rukiye
    Guler, Inan
    Isik, Ali Hakan
    [J]. 2013 21ST SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2013,
  • [23] Data Driven Feature Selection for Machine Learning Algorithms in Computer Vision
    Zhang, Fan
    Li, Wei
    Zhang, Yifan
    Feng, Zhiyong
    [J]. IEEE INTERNET OF THINGS JOURNAL, 2018, 5 (06): 4262 - 4272
  • [24] Probabilistic Feature Selection in Machine Learning
    Ghosh, Indrajit
    [J]. ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, ICAISC 2018, PT I, 2018, 10841 : 623 - 632
  • [25] A Deep Feature Learning Model for Pneumonia Detection Applying a Combination of mRMR Feature Selection and Machine Learning Models
    Togacar, M.
    Ergen, B.
    Comert, Z.
    Ozyurt, F.
    [J]. IRBM, 2020, 41 (04): 212 - 222
  • [26] A Comprehensive Review of Feature Selection and Feature Selection Stability in Machine Learning
    Buyukkececi, Mustafa
    Okur, Mehmet Cudi
    [J]. GAZI UNIVERSITY JOURNAL OF SCIENCE, 2023, 36 (04): 1506 - 1520
  • [27] Accuracy, Fairness, and Interpretability of Machine Learning Criminal Recidivism Models
    Ingram, Eric
    Gursoy, Furkan
    Kakadiaris, Ioannis A.
    [J]. 2022 IEEE/ACM INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING, APPLICATIONS AND TECHNOLOGIES, BDCAT, 2022, : 233 - 241
  • [28] Feature fusion improves performance and interpretability of machine learning models in identifying soil pollution of potentially contaminated sites
    Lu, Xiaosong
    Du, Junyang
    Zheng, Liping
    Wang, Guoqing
    Li, Xuzhi
    Sun, Li
    Huang, Xinghua
    [J]. ECOTOXICOLOGY AND ENVIRONMENTAL SAFETY, 2023, 259
  • [29] Applying Genetic Programming to Improve Interpretability in Machine Learning Models
    Ferreira, Leonardo Augusto
    Guimaraes, Frederico Gadelha
    Silva, Rodrigo
    [J]. 2020 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2020,
  • [30] Approach to provide interpretability in machine learning models for image classification
    Stadlhofer, Anja
    Mezhuyev, Vitaliy
    [J]. INDUSTRIAL ARTIFICIAL INTELLIGENCE, 1 (1):