Parallel Integration of Heterogeneous Genome-Wide Data Sources

Cited by: 0
Authors
Greene, Derek [1]
Bryan, Kenneth [1]
Cunningham, Padraig [1]
Affiliations
[1] Univ Coll Dublin, Sch Informat & Comp Sci, Dublin, Ireland
Keywords
DOI
Not available
Chinese Library Classification number
R318 [Biomedical Engineering];
Discipline classification code
0831;
Abstract
Heterogeneous genome-wide data sources capture information on various aspects of complex biological systems. For instance, transcriptome-, interactome-, and phenome-level information may be derived from mRNA expression data, protein-protein interaction networks, and biomedical literature corpora, respectively. Each source provides a distinct "view" of the same domain, but potentially encodes different biologically relevant patterns. Effective integration of such views can provide a richer, more informative model of an organism's functional modules than one built from a single view alone. Existing machine learning strategies for information fusion largely focus on producing a consensus model that reflects patterns shared between views. However, the information provided by different views may not always be easily reconciled, because the data are incomplete or because some patterns are present in one view but not in another. To address this problem, we present the Parallel Integration Clustering Algorithm (PICA), a novel cluster analysis approach that supports the simultaneous integration of information from two or more sources. The resulting model preserves patterns that are unique to individual views, as well as those common to all views. We demonstrate the effectiveness of PICA in identifying significant patterns corresponding to functional groupings when applied to three genome-wide datasets.
Pages: 368 - 374
Page count: 7
Related papers
50 in total
  • [21] Simultaneous analysis of genome-wide SNP data
    Hoggart, C. J.
    De Iorio, M.
    Whittaker, J. C.
    Balding, D. J.
    [J]. GENETIC EPIDEMIOLOGY, 2007, 31 (06) : 609 - 609
  • [22] Calibration of variant effect predictors on genome-wide data masks heterogeneous performance across genes
    Tejura, Malvika
    Fayer, Shawn
    McEwen, Abbye E.
    Flynn, Jake
    Starita, Lea M.
    Fowler, Douglas M.
    [J]. AMERICAN JOURNAL OF HUMAN GENETICS, 2024, 111 (09)
  • [23] Hierarchical Modelling for Genome-Wide Association Data
    Heron, Eleisa
    O'Dushlaine, Colm
    [J]. ANNALS OF HUMAN GENETICS, 2009, 73 : 665 - 665
  • [24] Ethnicity, Body Mass, and Genome-Wide Data
    Boardman, Jason D.
    Blalock, Casey L.
    Corley, Robin P.
    Stallings, Michael C.
    Domingue, Benjamin W.
    McQueen, Matthew B.
    Crowley, Thomas J.
    Hewitt, John K.
    Lu, Ying
    Field, Samuel H.
    [J]. BIODEMOGRAPHY AND SOCIAL BIOLOGY, 2010, 56 (02) : 123 - 136
  • [25] Detecting Relatives from Genome-Wide Data
    Sun, Meng
    [J]. HUMAN HEREDITY, 2013, 76 (02) : 97 - 97
  • [26] Genome-Wide Association Mapping With Longitudinal Data
    Furlotte, Nicholas A.
    Eskin, Eleazar
    Eyheramendy, Susana
    [J]. GENETIC EPIDEMIOLOGY, 2012, 36 (05) : 463 - 471
  • [27] Genome-wide association of autoimmune neuroinflammation in the heterogeneous stock of rats
    Stridh, Pernilla
    Jagodic, Maja
    Ockinger, Johan
    Beyeen, Amennai
    Gillett, Alan
    Ortlieb, Andre
    Abdelmagid, Nada
    Diez, Margarita
    Olsson, Tomas
    [J]. JOURNAL OF NEUROIMMUNOLOGY, 2012, 253 (1-2) : 61 - 61
  • [28] Microarray data integration for genome-wide analysis of human tissue-selective gene expression
    Wang, Liangjiang
    Srivastava, Anand K.
    Schwartz, Charles E.
    [J]. BMC GENOMICS, 2010, 11
  • [30] Parallel Genome-Wide Analysis With Central And Graphic Processing Units
    Kacamarga, Muhamad Fitra
    Baurley, James W.
    Pardamean, Bens
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2015, : 265 - 269