Novel function discovery through sequence and structural data mining

Cited by: 26
Authors
Lobb, Briallen [1]
Doxey, Andrew C. [1]
Affiliations
[1] Univ Waterloo, Dept Biol, 200 Univ Ave West, Waterloo, ON N2L 3G1, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
PROTEIN-PROTEIN INTERACTIONS; LARGE-SCALE; LINEAR MOTIFS; COMPLETE NITRIFICATION; STRUCTURE PREDICTION; ENZYME; EVOLUTION; SPECIFICITY; BACTERIA; SURFACE;
DOI
10.1016/j.sbi.2016.05.017
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Large-scale sequence and structural data is a goldmine of novel proteins, but how can this data be effectively mined for new functions? Here, we review protein function prediction methods and recent studies that apply these methods to discover new functionality. Core approaches include sequence-based homology detection, phylogenetic analysis, structural bioinformatics, and inference of functional associations using genomic context and related methods. With such a wide range of approaches, sequences may reveal new functionality regardless of their similarity to a characterized reference. Homologs of known function may be identified in unexpected species or associations. Detection of functional shifts in sequences may reveal new activities and specificities. New protein functions may also be predicted in uncharacterized sequences and structures. Finally, methods and data may be integrated and applied at increasingly large scales due to improved protein domain knowledge and structural coverage, which amplifies the ability to predict and discover novel protein functions.
Pages: 53-61
Page count: 9