Exploring microbial genome sequences to identify protein families on the grid

Cited by: 3
Authors
Sun, Yudong [1]
Wipat, Anil [1]
Pocock, Matthew [1]
Lee, Peter A. [1]
Flanagan, Keith [1]
Worthington, James T. [1]
Affiliations
[1] Newcastle Univ, Sch Comp Sci, Newcastle Upon Tyne NE1 7RU, Tyne & Wear, England
Funding
Biotechnology and Biological Sciences Research Council (BBSRC), UK
Keywords
genome analysis; grid; microbial genomes; protein families; Web services
DOI
10.1109/TITB.2007.892913
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The analysis of microbial genome sequences can identify protein families that provide potential drug targets for new antibiotics. With the rapid accumulation of newly sequenced genomes, this analysis has become a computationally intensive and data-intensive problem. This paper describes the development of a Web-service-enabled, component-based architecture (Microbase) to support the large-scale comparative analysis of complete microbial genome sequences and the subsequent identification of orthologues and protein families. The system is coordinated through Web-service-based notifications and integrates distributed computing resources with genomic databases to perform all-against-all comparisons across a large volume of genome sequences and to present the data in a computationally amenable format through a Web service interface. We demonstrate the use of the system in searching for orthologues and candidate protein families, which could ultimately lead to the identification of potential therapeutic targets.
Pages: 435-442
Number of pages: 8