Integrating database homology in a probabilistic gene structure model

Authors
Kulp, D
Haussler, D
Reese, MG
Eeckman, FH
DOI
Not available
Chinese Library Classification
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
We present an improved stochastic model of genes in DNA, and describe a method for integrating database homology into the probabilistic framework. A generalized hidden Markov model (GHMM) describes the grammar of a legal parse of a DNA sequence. Probabilities are estimated for gene features by using dynamic programming to combine information from multiple sensors. We show how matches to homologous sequences from a database can be integrated into the probability estimation by interpreting the likelihood of a sequence in terms of the bit-cost to encode a sequence given a homology match. We also demonstrate how homology matches in protein databases can be exploited to help identify splice sites. Our experiments show significant improvements in the sensitivity and specificity of gene structure identification when these new features are added to our gene-finding system, Genie. Experimental results in tests using a standard set of annotated genes showed that Genie identified 95% of coding nucleotides correctly with a specificity of 91%, and 77% of exons were identified exactly.
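
The nucleotide-level figures quoted in the abstract (95% of coding nucleotides found, 91% specificity) can be illustrated with a minimal sketch. It assumes the convention common in gene-finding evaluations, where sensitivity is TP / (TP + FN) and "specificity" is TP / (TP + FP); the abstract does not spell out its formulas, so this is an assumption rather than the authors' stated method. The function name `nucleotide_accuracy` and the toy labels are hypothetical.

```python
def nucleotide_accuracy(true_coding, pred_coding):
    """Return (sensitivity, specificity) over per-base coding labels.

    Assumed convention (not confirmed by the abstract):
      sensitivity = TP / (TP + FN)
      specificity = TP / (TP + FP)
    """
    if len(true_coding) != len(pred_coding):
        raise ValueError("label sequences must have equal length")

    pairs = list(zip(true_coding, pred_coding))
    tp = sum(1 for t, p in pairs if t and p)        # coding, predicted coding
    fn = sum(1 for t, p in pairs if t and not p)    # coding, missed
    fp = sum(1 for t, p in pairs if p and not t)    # non-coding, predicted coding

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, specificity


# Toy labels, 1 = coding and 0 = non-coding (illustrative data only).
truth = [0, 1, 1, 1, 1, 0, 0, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0, 1, 1, 1, 0]
sn, sp = nucleotide_accuracy(truth, pred)
print(f"sensitivity={sn:.3f} specificity={sp:.3f}")
```

Exon-level accuracy (the 77% figure) would need a stricter comparison in which a predicted exon counts only if both of its boundaries match the annotation exactly; that is not covered by this per-base sketch.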
Pages: 232-244
Page count: 13