TreeHugger: A New Test for Enrichment of Gene Ontology Terms

Cited by: 4
Authors
Jupiter, Daniel [1]
Sahutoglu, Jessica [2]
VanBuren, Vincent [1]
Affiliations
[1] Texas A&M Hlth Sci Ctr, Dept Syst Biol & Translat Med, Coll Med, Temple, TX 76504 USA
[2] Washtenaw Community Hlth Org, Ypsilanti, MI 48198 USA
Keywords
statistics; data analysis; probability; genomics; microarray; EXPRESSION PROFILES; ART; TOOL; ANNOTATION; CATEGORIES; GRAPH; SETS;
DOI
10.1287/ijoc.1090.0356
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
The Gene Ontology (GO) project provides a structured vocabulary of biological terms used by biological researchers as a tool for standardization of references to biological entities. Genes may be annotated with GO terms to indicate their roles or localizations in the cell. GO has been used in conjunction with high-throughput experimental methods, such as microarrays. In this setting, the interest is to determine whether sets of genes identified by the high-throughput experiment are enriched for GO terms: Do certain terms annotate more genes in the identified set than one might expect? Enriched terms are taken as a potential summary of the cellular function for the identified set of genes and may provide clues leading to new directions for investigation. Current methods for determining whether sets of genes are GO-enriched have certain well-known shortcomings. Many methods do not take the hierarchical structure of the ontology into account in determining enrichment. We address this drawback by introducing a new statistical test (TreeHugger) based on a novel per-gene scoring scheme for GO terms. Given a set of genes and a specified subset of those genes, our method determines enrichment of GO terms in the subset, taking into account the structure of the ontology and ascribing a lower weight to those terms that do not themselves directly annotate the given genes. Tests on simulated and real data indicate that our method is a conservative test for enrichment. Testing TreeHugger on a biological example reveals that it also reduces the redundancy caused by giving high scores to indirect annotations as provided by standard enrichment tests.
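For orientation, the question posed in the abstract ("Do certain terms annotate more genes in the identified set than one might expect?") is commonly answered with a one-sided hypergeometric test. The sketch below shows that baseline only. It is not the TreeHugger scoring scheme, which this record does not specify. The function name and parameter names are hypothetical and chosen for illustration.

```python
from math import comb


def hypergeom_enrichment_pvalue(n_universe, n_annotated, n_subset, n_overlap):
    """One-sided hypergeometric p-value for a single GO term.

    Illustrative baseline only (assumed names, not the TreeHugger method):
      n_universe  -- total number of genes considered
      n_annotated -- genes in the universe annotated with the term
      n_subset    -- genes in the identified subset
      n_overlap   -- genes in the subset annotated with the term

    Returns P(X >= n_overlap), where X counts annotated genes in a
    subset of size n_subset drawn without replacement.
    """
    upper = min(n_annotated, n_subset)
    total = comb(n_universe, n_subset)
    # Sum the probability mass of every outcome at least as extreme
    # as the observed overlap.
    tail = sum(
        comb(n_annotated, k) * comb(n_universe - n_annotated, n_subset - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 1000 genes, 40 annotated, subset of 50, 8 shared.
    print(hypergeom_enrichment_pvalue(1000, 40, 50, 8))
```

A test of this kind treats each term in isolation, so a term and its ancestors in the ontology can all score highly from the same genes. That is the redundancy the abstract says TreeHugger is designed to reduce.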
Pages: 210-221
Page count: 12