Selecting high-dimensional mixed graphical models using minimal AIC or BIC forests

Cited by: 56
Authors
Edwards, David [1]
de Abreu, Gabriel C. G. [1]
Labouriau, Rodrigo [1]
Affiliation
[1] Aarhus Univ, Fac Agr Sci, Inst Genet & Biotechnol, Aarhus, Denmark
Source
BMC BIOINFORMATICS | 2010 / Vol. 11
Keywords
MAXIMUM-LIKELIHOOD; NETWORKS; RECONSTRUCTION; TREES;
DOI
10.1186/1471-2105-11-18
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Chow and Liu showed that the maximum likelihood tree for multivariate discrete distributions may be found using a maximum weight spanning tree algorithm, for example Kruskal's algorithm. The efficiency of the algorithm makes it tractable for high-dimensional problems.

Results: We extend Chow and Liu's approach in two ways: first, to find the forest optimizing a penalized likelihood criterion, for example AIC or BIC, and second, to handle data with both discrete and Gaussian variables. We apply the approach to three datasets: two from gene expression studies and the third from a genetics of gene expression study. The minimal BIC forest supplements a conventional analysis of differential expression by providing a tentative network for the differentially expressed genes. In the genetics of gene expression context the method identifies a network approximating the joint distribution of the DNA markers and the gene expression levels.

Conclusions: The approach is generally useful as a preliminary step towards understanding the overall dependence structure of high-dimensional discrete and/or continuous data. Trees and forests are unrealistically simple models for biological systems, but can provide useful insights. Uses include the following: identification of distinct connected components, which can be analysed separately (dimension reduction); identification of neighbourhoods for more detailed analyses; as initial models for search algorithms with a larger search space, for example decomposable models or Bayesian networks; and identification of interesting features, such as hub nodes.
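The idea summarized in the abstract can be illustrated with a minimal sketch for the purely discrete case. It is not the authors' implementation: the function names (`mutual_information`, `min_ic_forest`) are hypothetical, and the edge weight used here (deviance 2n·MI minus a penalty of alpha times the degrees of freedom, with alpha = 2 for AIC and log n for BIC) is an assumption about how the penalty enters, not something stated in the abstract. Edges with non-positive penalized weight are discarded, and Kruskal's algorithm is run on the rest, so the result may be a forest rather than a spanning tree. The mixed discrete/Gaussian extension is not covered.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(x, y):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(x)
    cx, cy, cxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log(c * n / (cx[a] * cy[b]))
        for (a, b), c in cxy.items()
    )


def min_ic_forest(columns, criterion="bic"):
    """Return the edges of a penalized maximum-weight spanning forest.

    columns: dict mapping variable name -> list of discrete values
             (all lists the same length).
    criterion: "bic" (penalty log n per parameter) or "aic" (penalty 2).
    """
    names = list(columns)
    n = len(columns[names[0]])
    alpha = math.log(n) if criterion == "bic" else 2.0

    # Penalized weight per candidate edge (assumed form, see lead-in).
    edges = []
    for u, v in combinations(names, 2):
        deviance = 2.0 * n * mutual_information(columns[u], columns[v])
        df = (len(set(columns[u])) - 1) * (len(set(columns[v])) - 1)
        weight = deviance - alpha * df
        if weight > 0:  # an edge is only worth adding if it improves the criterion
            edges.append((weight, u, v))

    # Kruskal's algorithm with a simple union-find structure.
    parent = {name: name for name in names}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    forest = []
    for weight, u, v in sorted(edges, reverse=True):
        ru, rv = find(u), find(v)
        if ru != rv:  # adding the edge does not create a cycle
            parent[ru] = rv
            forest.append((u, v))
    return forest


# Toy data: b copies a, while c is empirically independent of both.
data = {
    "a": [0, 1] * 20,
    "b": [0, 1] * 20,
    "c": [0, 0, 1, 1] * 10,
}
print(min_ic_forest(data, criterion="bic"))
```

In the toy data only the a-b edge has positive penalized weight, so `c` is left as an isolated component; this is the sense in which the forest can separate variables into distinct connected components.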
Pages: 13
Related papers
50 in total
  • [1] Selecting high-dimensional mixed graphical models using minimal AIC or BIC forests
    David Edwards
    Gabriel CG de Abreu
    Rodrigo Labouriau
    [J]. BMC Bioinformatics, 11 (1)
  • [2] High-Dimensional Mixed Graphical Models
    Cheng, Jie
    Li, Tianxi
    Levina, Elizaveta
    Zhu, Ji
    [J]. JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2017, 26 (02) : 367 - 378
  • [3] High-dimensional undirected graphical models for arbitrary mixed data
    Goebler, Konstantin
    Drton, Mathias
    Mukherjee, Sach
    Miloschewski, Anne
    [J]. ELECTRONIC JOURNAL OF STATISTICS, 2024, 18 (01) : 2339 - 2404
  • [4] Asymptotics of AIC, BIC and Cp model selection rules in high-dimensional regression
    Bai, Zhidong
    Choi, Kwok Pui
    Fujikoshi, Yasunori
    Hu, Jiang
    [J]. BERNOULLI, 2022, 28 (04) : 2375 - 2403
  • [5] Evaluating the performance of AIC and BIC for selecting spatial econometric models
    Christos Agiakloglou
    Apostolos Tsimpanos
    [J]. Journal of Spatial Econometrics, 2023, 4 (1):
  • [6] CONSISTENCY OF AIC AND BIC IN ESTIMATING THE NUMBER OF SIGNIFICANT COMPONENTS IN HIGH-DIMENSIONAL PRINCIPAL COMPONENT ANALYSIS
    Bai, Zhidong
    Choi, Kwok Pui
    Fujikoshi, Yasunori
    [J]. ANNALS OF STATISTICS, 2018, 46 (03) : 1050 - 1076
  • [7] Nonparametric and high-dimensional functional graphical models
    Solea, Eftychia
    Dette, Holger
    [J]. ELECTRONIC JOURNAL OF STATISTICS, 2022, 16 (02) : 6175 - 6231
  • [8] Estimation of high-dimensional graphical models using regularized score matching
    Lin, Lina
    Drton, Mathias
    Shojaie, Ali
    [J]. ELECTRONIC JOURNAL OF STATISTICS, 2016, 10 (01) : 806 - 854
  • [9] mgm: Estimating Time-Varying Mixed Graphical Models in High-Dimensional Data
    Haslbeck, Jonas M. B.
    Waldorp, Lourens J.
    [J]. JOURNAL OF STATISTICAL SOFTWARE, 2020, 93 (08):
  • [10] A high-dimensional bias-corrected AIC for selecting response variables in multivariate calibration
    Oda, Ryoya
    Mima, Yoshie
    Yanagihara, Hirokazu
    Fujikoshi, Yasunori
    [J]. COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2021, 50 (14) : 3453 - 3476