Parallel Integration of Heterogeneous Genome-Wide Data Sources

Authors
Greene, Derek [1]
Bryan, Kenneth [1]
Cunningham, Padraig [1]
Affiliations
[1] Univ Coll Dublin, Sch Informat & Comp Sci, Dublin, Ireland
DOI
Not available
Chinese Library Classification number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Heterogeneous genome-wide data sources capture information on various aspects of complex biological systems. For instance, transcriptome, interactome and phenome-level information may be derived from mRNA expression data, protein-protein interaction networks, and biomedical literature corpora. Each source provides a distinct "view" of the same domain, but potentially encodes different biologically-relevant patterns. Effective integration of such views can provide a richer, more informative model of an organism's functional modules than that produced on a single view alone. Existing machine learning strategies for information fusion largely focus on the production of a consensus model that reflects patterns shared between views. However, the information provided by different views may not always be easily reconciled, due to the incomplete nature of the data, or the fact that some patterns will be present in one view but not in another. To address this problem, we present the Parallel Integration Clustering Algorithm (PICA), a novel cluster analysis approach which supports the simultaneous integration of information from two or more sources. The resulting model preserves patterns that are unique to individual views, as well as those common to all views. We demonstrate the effectiveness of PICA in identifying significant patterns corresponding to functional groupings, when applied to three genome-wide datasets.
Pages: 368-374
Page count: 7