Microbial genotype-phenotype mapping by class association rule mining

Cited by: 34
Authors
Tamura, Makio [1 ]
D'haeseleer, Patrik [1 ]
Affiliations
[1] Lawrence Livermore Natl Lab, Comp Applicat & Res Dept, Chem Mat Earth & Life Sci Dept, Microbial Syst Biol Grp, Livermore, CA 94550 USA
DOI
10.1093/bioinformatics/btn210
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Microbial phenotypes typically arise from the concerted action of multiple gene functions, yet the presence of any single gene may correlate only weakly with the observed phenotype. Hence, it may be more appropriate to examine co-occurrence between sets of genes and a phenotype (multiple-to-one) than pairwise relations between a single gene and the phenotype. Here, we propose an efficient class association rule mining algorithm, NETCAR, to extract sets of COGs (clusters of orthologous groups of proteins) associated with a phenotype from COG phylogenetic profiles and a phenotype profile. NETCAR takes into account the phylogenetic co-occurrence graph between COGs to restrict the hypothesis space, and uses mutual information to evaluate the biconditional relation.
Results: We examined the mining capability of pairwise and multiple-to-one association by using NETCAR to extract COGs relevant to six microbial phenotypes (aerobic, anaerobic, facultative, endospore, motility and Gram negative) from 11,969 unique COG profiles across 155 prokaryotic organisms. At the same false discovery rate, multiple-to-one association can extract about 10 times more relevant COGs than one-to-one association. We also reveal various topologies of association networks among COGs (modules) from the extracted multiple-to-one correlation rules relevant to the six phenotypes; these include a well-connected network for motility, a star-shaped network for aerobic and intermediate topologies for the other phenotypes. NETCAR outperforms a standard CAR mining algorithm, CARAPRIORI, while requiring several orders of magnitude less computational time for extracting 3-COG sets.
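The multiple-to-one idea in the abstract can be illustrated with a minimal sketch. It is not the NETCAR implementation: the toy profiles, the COG names, the edge list and the helper functions (`mutual_information`, `conjunction`, `score_pairs`) are all assumptions made for illustration. The sketch scores a COG set by the mutual information between its conjunction (all COGs present in a genome) and the phenotype profile, and only evaluates pairs that are linked in an assumed co-occurrence graph.

```python
from math import log2


def mutual_information(x, y):
    """Mutual information in bits between two equal-length binary sequences."""
    n = len(x)
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_ab = sum(1 for xi, yi in zip(x, y) if xi == a and yi == b) / n
            p_a = sum(1 for xi in x if xi == a) / n
            p_b = sum(1 for yi in y if yi == b) / n
            if p_ab > 0:  # empty cells contribute nothing
                mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


def conjunction(profiles, cogs):
    """Return 1 for each genome in which every COG of the set is present."""
    n = len(next(iter(profiles.values())))
    return [int(all(profiles[c][i] for c in cogs)) for i in range(n)]


def score_pairs(profiles, phenotype, edges):
    """Score only COG pairs that are linked in the co-occurrence graph."""
    scores = {}
    for a, b in edges:
        key = tuple(sorted((a, b)))
        scores[key] = mutual_information(conjunction(profiles, key), phenotype)
    return scores


# Toy presence/absence profiles across 8 hypothetical genomes (invented data).
profiles = {
    "COG_A": [1, 1, 1, 0, 1, 0, 1, 0],
    "COG_B": [1, 1, 0, 1, 1, 0, 0, 1],
    "COG_C": [0, 1, 0, 1, 0, 1, 0, 1],
}
phenotype = [1, 1, 0, 0, 1, 0, 0, 0]

# Assumed co-occurrence graph: A-C is not an edge, so that pair is never scored.
edges = [("COG_A", "COG_B"), ("COG_B", "COG_C")]

single_a = mutual_information(profiles["COG_A"], phenotype)
scores = score_pairs(profiles, phenotype, edges)
print(f"COG_A alone: {single_a:.3f} bits")
for pair, value in scores.items():
    print(f"{pair}: {value:.3f} bits")
```

In this toy data each of COG_A and COG_B is only weakly informative on its own, while their conjunction matches the phenotype exactly, which is the situation the multiple-to-one association is meant to capture. Restricting candidates to graph edges is what keeps the number of scored sets small; how NETCAR builds and traverses its graph is not specified here.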
Pages: 1523-1529
Page count: 7