Combining compositional data sets introduces error in covariance network reconstruction

Cited by: 2
Authors
Brunner, James D. [1 ,2 ,3 ]
Robinson, Aaron J. [1 ]
Chain, Patrick S. G. [1 ]
Affiliations
[1] Los Alamos Natl Lab, Biosci Div, Los Alamos, NM 87545 USA
[2] Los Alamos Natl Lab, Ctr Nonlinear Studies, Los Alamos, NM 87545 USA
[3] Los Alamos Natl Lab, POB 1663, Los Alamos, NM 87545 USA
Source
ISME COMMUNICATIONS | 2024 / Volume 4 / Issue 01
Keywords
transkingdom network inference; microbiome; bacterial fungal interaction; STATISTICAL-ANALYSIS; COMMUNITIES; BACTERIAL;
DOI
10.1093/ismeco/ycae057
Chinese Library Classification number
Q14 [Ecology (Bioecology)];
Discipline classification code
071012 ; 0713 ;
Abstract
Microbial communities are diverse biological systems that include taxa from across multiple kingdoms of life. Notably, interactions between bacteria and fungi play a significant role in determining community structure. However, with standard network inference techniques, statistical associations across kingdoms are more difficult to infer than intra-kingdom associations because of the nature of the data involved. We quantify the challenges of cross-kingdom network inference from both theoretical and practical points of view using synthetic and real-world microbiome data. We detail the theoretical issue presented by combining compositional data sets drawn from the same environment, e.g. 16S and ITS sequencing of a single set of samples, and we survey common network inference techniques for their ability to handle this error. We then test these techniques for the accuracy and usefulness of their intra- and inter-kingdom associations by inferring networks from a set of simulated samples for which a ground-truth set of associations is known. We show that while the two methods mitigate the error of cross-kingdom inference, there is little difference between techniques for key practical applications, including identification of strong correlations and identification of possible keystone taxa (i.e. hub nodes in the network). Furthermore, we identify a signature of the error caused by transkingdom network inference and demonstrate that it appears in networks constructed using real-world environmental microbiome data.
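The issue of combining compositional data sets can be illustrated with a small simulation. The following is a minimal sketch and not the authors' code: it assumes log-normal absolute abundances, a single hypothetical cross-kingdom association, a centered log-ratio (CLR) transform, and Pearson correlation as the network inference step. All function names and parameter values are illustrative assumptions.

```python
import numpy as np


def closure(counts):
    """Convert absolute abundances to relative abundances (each row sums to 1)."""
    return counts / counts.sum(axis=1, keepdims=True)


def clr(rel):
    """Centered log-ratio transform, applied row-wise."""
    log_rel = np.log(rel)
    return log_rel - log_rel.mean(axis=1, keepdims=True)


def simulate(n_samples=200, n_bact=5, n_fungi=5, seed=0):
    """Simulate two kingdoms measured as separate compositions (assumed model)."""
    rng = np.random.default_rng(seed)
    n_taxa = n_bact + n_fungi
    # Assumed ground truth: log-abundances are multivariate normal with one
    # cross-kingdom association (bacterial taxon 0 with fungal taxon n_bact).
    cov = np.eye(n_taxa)
    cov[0, n_bact] = cov[n_bact, 0] = 0.8
    log_abund = rng.multivariate_normal(np.zeros(n_taxa), cov, size=n_samples)
    absolute = np.exp(log_abund)
    # Each kingdom is normalized on its own, mimicking separate 16S and ITS
    # sequencing of the same samples; the relative scale between kingdoms is lost.
    bact_rel = closure(absolute[:, :n_bact])
    fungi_rel = closure(absolute[:, n_bact:])
    combined = np.hstack([bact_rel, fungi_rel])
    return log_abund, combined


log_abund, combined = simulate()
true_corr = np.corrcoef(log_abund, rowvar=False)        # correlations of the simulated log-abundances
naive_corr = np.corrcoef(clr(combined), rowvar=False)   # correlations after naive concatenation + CLR
error = np.abs(naive_corr - true_corr)                  # entrywise reconstruction error
```

Comparing the cross-kingdom block `error[:5, 5:]` with the intra-kingdom blocks `error[:5, :5]` and `error[5:, 5:]` gives a rough view of where the reconstruction error falls in this toy setting; the result depends on the assumed model and seed.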
Pages: 12