Gaussian Graphical Model Exploration and Selection in High Dimension Low Sample Size Setting

Cited by: 5
Authors
Lartigue, Thomas [1 ,2 ]
Bottani, Simona [3 ]
Baron, Stephanie [4 ]
Colliot, Olivier [3 ]
Durrleman, Stanley [3 ]
Allassonniere, Stephanie [5 ]
Affiliations
[1] Ecole Polytech, IP, CMAP, CNRS, Paris, France
[2] INRIA, Aramis Project Team, F-91128 Palaiseau, France
[3] Sorbonne Univ, CNRS UMR 7225, Inserm U1127, Aramis Project Team,Inria,Inst Cerveau & Moelle E, F-75004 Paris, France
[4] Hop Europeen Georges Pompidou, AP HP, F-75015 Paris, France
[5] Sorbonne Univ, INSERM, Univ Paris, Ctr Rech Cordeliers, F-75006 Paris, France
Funding
European Research Council
Keywords
Correlation; Covariance matrices; Measurement; Graphical models; Gaussian distribution; Sparse representation; Alzheimer's disease; Gaussian graphical models; model selection; high dimension low sample size; sparse matrices; maximum likelihood estimation; MAXIMUM-LIKELIHOOD-ESTIMATION; COVARIANCE ESTIMATION; SPARSE ESTIMATION; LASSO;
DOI
10.1109/TPAMI.2020.2980542
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104 ; 0812 ; 0835 ; 1405
Abstract
Gaussian graphical models (GGM) are often used to describe the conditional correlations between the components of a random vector. In this article, we compare two families of GGM inference methods: the nodewise approach and penalised likelihood maximisation. We demonstrate on synthetic data that, when the sample size is small, the two methods produce graphs with either too few or too many edges compared to the real one. As a result, we propose a composite procedure that explores a family of graphs with a nodewise numerical scheme and selects a candidate among them with an overall likelihood criterion. We demonstrate that, when the number of observations is small, this selection method yields graphs closer to the truth, whose corresponding distributions have smaller KL divergence with respect to the real distribution than those of the other two methods. Finally, we demonstrate the value of our algorithm in two concrete cases: first on brain imaging data, then on biological nephrology data. In both cases, our results are more in line with current knowledge in each field.
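As a rough illustration of the penalised-likelihood family that the abstract compares against, the sketch below fits scikit-learn's `GraphicalLasso` to synthetic data drawn from a known sparse precision matrix, then reads edges off the estimated precision. This is not the paper's composite procedure; the tridiagonal structure, dimensions, and regularisation strength `alpha` are arbitrary assumptions for the demo.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
p, n = 10, 40  # dimension vs. (relatively) small sample size

# Known sparse, positive-definite precision matrix (tridiagonal):
# each variable is conditionally correlated only with its neighbours.
precision = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
covariance = np.linalg.inv(precision)
X = rng.multivariate_normal(np.zeros(p), covariance, size=n)

# Penalised likelihood maximisation: l1-penalised Gaussian MLE.
model = GraphicalLasso(alpha=0.2).fit(X)
est = model.precision_

# Recovered graph: an edge wherever the estimated precision entry
# is non-zero off the diagonal.
edges = np.abs(est) > 1e-6
np.fill_diagonal(edges, False)
print("estimated number of edges:", int(edges.sum()) // 2)
```

With small `n`, the recovered edge set typically over- or under-shoots the true tridiagonal graph depending on `alpha`, which is the behaviour the abstract's model-selection criterion is designed to address.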
Pages: 3196-3213 (18 pages)
Related Papers (50 total)
  • [31] Some considerations of classification for high dimension low-sample size data
    Zhang, Lingsong
    Lin, Xihong
    [J]. STATISTICAL METHODS IN MEDICAL RESEARCH, 2013, 22 (05) : 537 - 550
  • [32] Random forest kernel for high-dimension low sample size classification
    Cavalheiro, Lucca Portes
    Bernard, Simon
    Barddal, Jean Paul
    Heutte, Laurent
    [J]. STATISTICS AND COMPUTING, 2024, 34 (01)
  • [33] Comparison of binary discrimination methods for high dimension low sample size data
    Bolivar-Cime, A.
    Marron, J. S.
    [J]. JOURNAL OF MULTIVARIATE ANALYSIS, 2013, 115 : 108 - 121
  • [34] On Some Fast And Robust Classifiers For High Dimension, Low Sample Size Data
    Roy, Sarbojit
    Choudhury, Jyotishka Ray
    Dutta, Subhajit
    [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151
  • [36] Maximum Projection Distance Classifier for High Dimension and Low Sample Size Problems
    Zhang, Zhiwang
    He, Jing
    Cao, Jie
    Li, Shuqing
    Ji, Yimu
    Qian, Gang
    Li, Xingsen
    Zhang, Kai
    Wang, Pingjiang
    [J]. PROCEEDINGS OF 2021 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY WORKSHOPS AND SPECIAL SESSIONS: (WI-IAT WORKSHOP/SPECIAL SESSION 2021), 2021, : 334 - 339
  • [37] Robust Bayesian model selection for variable clustering with the Gaussian graphical model
    Andrade, Daniel
    Takeda, Akiko
    Fukumizu, Kenji
    [J]. STATISTICS AND COMPUTING, 2020, 30 (02) : 351 - 376
  • [39] A classifier under the strongly spiked eigenvalue model in high-dimension, low-sample-size context
    Ishii, Aki
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2020, 49 (07) : 1561 - 1577
  • [40] A dimension reduction technique applied to regression on high dimension, low sample size neurophysiological data sets
    Santana, Adrielle C.
    Barbosa, Adriano V.
    Yehia, Hani C.
    Laboissière, Rafael
    [J]. BMC NEUROSCIENCE, 22