BAYESIAN MULTIPLE INSTANCE CLASSIFICATION BASED ON HIERARCHICAL PROBIT REGRESSION

Cited by: 0
Authors
Xiong, Danyi [1]
Park, Seongoh [2]
Lim, Johan [3]
Wang, Tao [4]
Wang, Xinlei [5]
Affiliations
[1] Southern Methodist Univ, Dept Stat Sci, Dallas, TX 75275 USA
[2] Sungshin Womens Univ, Dept Stat, Seoul, South Korea
[3] Seoul Natl Univ, Dept Stat, Seoul, South Korea
[4] UT Southwestern Med Ctr, Quantitat Biomed Res Ctr, Peter ODonnell Jr Sch Publ Hlth, Dallas, TX USA
[5] Univ Texas Arlington, Dept Math, Arlington, TX 76019 USA
Source
ANNALS OF APPLIED STATISTICS | 2024 / Vol. 18 / No. 1
Funding
National Research Foundation of Singapore;
Keywords
Binary classification; Bayesian inference; Gibbs sampling; primary instance; weakly supervised learning; BINARY; CANCER; MODELS;
DOI
10.1214/23-AOAS1780
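Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification code
020208 ; 070103 ; 0714 ;
Abstract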
In multiple instance learning (MIL), the response variable is predicted by features (or covariates) of one or more instances, which are collectively denoted as a bag. Learning the relationship between bags and instances is challenging because of the unknown and possibly complicated data generating mechanism regarding how instances contribute to the bag label. MIL has been applied to solve a variety of real-world problems, which have been mostly focused on supervised tasks, such as molecule activity prediction, protein binding affinities prediction, object detection, and computer-aided diagnosis. However, to date, the majority of the off-the-shelf MIL methods are developed in the computer science domain, and they focus on improving the prediction performance while spending little effort on explainability of the algorithm. In this article a Bayesian multiple instance learning model, based on probit regression (MICProB), is proposed, which contributes a significant portion to the suite of statistical methodologies for MIL. MICProB is composed of two nested probit regression models, where the inner model is estimated for predicting primary instances, which are considered as the "important" ones that determine the bag label, and the outer model is for predicting bag-level responses based on the primary instances estimated by the inner model. The posterior distribution of MICProB can be conveniently approximated using a Gibbs sampler, and the prediction for new bags can be performed in a fully integrated Bayesian way. We evaluate the performance of MICProB against 15 benchmark methods and demonstrate its competitiveness in simulation and real-data examples. In addition to its capability of identifying primary instances, as compared to existing optimization-based approaches, MICProB also enjoys great advantages in providing a transparent model structure, straightforward statistical inference of quantities related to model parameters, and favorable interpretability of covariate effects on the bag-level response.
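The nested structure described in the abstract can be illustrated with a small simulation. The sketch below is one possible reading only and is not taken from the article: the function names `probit` and `simulate_bag`, the coefficient vectors, the rule of averaging covariates over primary instances, and the fallback used when no instance is selected are all assumptions made for illustration. It generates data from a two-level probit structure and does not implement the Gibbs sampler.

```python
import numpy as np
from math import erf, sqrt


def probit(x):
    """Standard normal CDF, applied elementwise."""
    x = np.asarray(x, dtype=float)
    return 0.5 * (1.0 + np.vectorize(erf)(x / sqrt(2.0)))


def simulate_bag(X, inner_coef, outer_coef, rng):
    """Simulate one bag label from a two-level probit structure.

    X has shape (n_instances, n_features) and holds one bag's covariates.
    Inner model: an instance is primary with probability Phi(x @ inner_coef).
    Outer model: the bag label is 1 with probability
    Phi(mean of primary rows @ outer_coef).
    Averaging over primary instances is an assumption of this sketch.
    """
    p_primary = probit(X @ inner_coef)
    is_primary = rng.random(X.shape[0]) < p_primary
    if not is_primary.any():
        # Assumption: keep the most likely instance so no bag is empty.
        is_primary[np.argmax(p_primary)] = True
    bag_features = X[is_primary].mean(axis=0)
    p_label = float(probit(bag_features @ outer_coef))
    label = int(rng.random() < p_label)
    return label, is_primary, p_label


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))          # one bag with 5 instances, 3 covariates
inner_coef = np.array([1.0, 0.0, 0.0])  # hypothetical inner coefficients
outer_coef = np.array([0.0, 1.0, 0.0])  # hypothetical outer coefficients
label, is_primary, p_label = simulate_bag(X, inner_coef, outer_coef, rng)
print(label, is_primary, round(p_label, 3))
```

In this reading, the inner coefficients govern which instances count as primary and the outer coefficients govern how those instances drive the bag label, which is what makes the covariate effects separately interpretable.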
Pages: 80-99
Page count: 20