A user-guided Bayesian framework for ensemble feature selection in life science applications (UBayFS)

Cited by: 5
Authors
Jenul, Anna [1]
Schrunner, Stefan [1]
Pilz, Jurgen [2]
Tomic, Oliver [1]
Affiliations
[1] Norwegian Univ Life Sci, Dept Data Sci, As, Norway
[2] Univ Klagenfurt, Dept Stat, Klagenfurt, Austria
Keywords
Ensemble feature selection; Bayesian model; Dirichlet-multinomial; User constraints; CANCER; CLASSIFICATION; DIAGNOSIS;
DOI
10.1007/s10994-022-06221-9
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Feature selection reduces the complexity of high-dimensional datasets and helps to gain insights into systematic variation in the data. These aspects are essential in domains that rely on model interpretability, such as life sciences. We propose a (U)ser-Guided (Bay)esian Framework for (F)eature (S)election, UBayFS, an ensemble feature selection technique embedded in a Bayesian statistical framework. Our generic approach considers two sources of information: data and domain knowledge. From data, we build an ensemble of feature selectors, described by a multinomial likelihood model. Using domain knowledge, the user guides UBayFS by weighting features and penalizing feature blocks or combinations, implemented via a Dirichlet-type prior distribution. Hence, the framework combines three main aspects: ensemble feature selection, expert knowledge, and side constraints. Our experiments demonstrate that UBayFS (a) allows for a balanced trade-off between user knowledge and data observations and (b) achieves accurate and robust results.
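The Dirichlet-multinomial idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy data, and the plain conjugate update (prior weights plus ensemble selection counts) are assumptions made for illustration, and the side constraints on feature blocks or combinations are left out entirely.

```python
from typing import List, Sequence


def posterior_feature_weights(
    prior_weights: Sequence[float],
    selections: Sequence[Sequence[int]],
) -> List[float]:
    """Return the posterior mean of a Dirichlet over feature importances.

    prior_weights: user-supplied Dirichlet parameters, one per feature
                   (hypothetical encoding of expert knowledge).
    selections:    one list of selected feature indices per ensemble member
                   (hypothetical encoding of the data-driven evidence).
    """
    n_features = len(prior_weights)

    # Count how often each feature was picked across the ensemble.
    counts = [0] * n_features
    for selected in selections:
        for j in selected:
            counts[j] += 1

    # Conjugate update: Dirichlet prior + multinomial counts gives a
    # Dirichlet posterior whose parameters are the elementwise sum.
    posterior = [a + c for a, c in zip(prior_weights, counts)]

    # Posterior mean of a Dirichlet is the normalized parameter vector.
    total = sum(posterior)
    return [p / total for p in posterior]


def top_k_features(weights: Sequence[float], k: int) -> List[int]:
    """Pick the k features with the largest posterior weight."""
    ranked = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)
    return sorted(ranked[:k])


# Toy example: four features, a flat prior, three ensemble members.
weights = posterior_feature_weights(
    prior_weights=[1.0, 1.0, 1.0, 1.0],
    selections=[[0, 1], [0, 2], [0, 1]],
)
selected = top_k_features(weights, k=2)
```

Raising a feature's prior weight shifts the posterior toward the user's belief, while a larger ensemble lets the data dominate. This is one simple way to picture the trade-off between user knowledge and data observations that the abstract describes.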
Pages: 3897 - 3923
Page count: 27