GeneSetCluster: a tool for summarizing and integrating gene-set analysis results

Cited by: 8
Authors
Ewing, Ewoud [1]
Planell-Picola, Nuria [2]
Jagodic, Maja [3]
Gomez-Cabrero, David [2, 3]
Affiliations
[1] Karolinska Inst, Ctr Mol Med, Dept Clin Neurosci, S-17177 Stockholm, Sweden
[2] Univ Publ Navarra UPNA, Complejo Hosp Navarra CHN, Navarrabiomed, Translat Bioinformat Unit,IdiSNA, Pamplona, Spain
[3] Karolinska Inst, Ctr Mol Med, Dept Med, Unit Computat Med, S-17177 Stockholm, Sweden
Funding
Swedish Research Council
Keywords
Data-mining; Gene-set enrichment; Clustering pathways; Overlapping pathways; Clustering gene-sets; ENRICHMENT; PACKAGE
DOI
10.1186/s12859-020-03784-z
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: Gene-set analysis tools, which make use of curated sets of molecules grouped by shared function, aim to identify which gene-sets are over-represented among the features associated with a given trait of interest. Such tools, for example Ingenuity or GSEA, are frequently used in gene-centric approaches derived from RNA-sequencing or microarrays, but they have also been adapted for interval-based analysis derived from DNA methylation or ChIP/ATAC-sequencing. Gene-set analysis tools return a list of significant gene-sets. While these results help the researcher identify major biological insights, they can be complex to interpret because many gene-sets have largely overlapping gene content. Additionally, in many cases the result consists of a large number of gene-sets, which makes it complicated to identify the major biological insights.
Results: We present GeneSetCluster, a novel approach that clusters identified gene-sets, from one or multiple experiments and/or tools, based on shared genes. GeneSetCluster calculates a distance score based on overlapping gene content and uses it to cluster the gene-sets; as a result, it identifies groups of gene-sets with similar gene-set definitions (i.e. gene content). The researcher can then focus biological interpretation on these groups.
Conclusions: GeneSetCluster is a novel approach for grouping gene-set analysis results based on overlapping gene content. GeneSetCluster is implemented as a package in R. The package and the vignette can be downloaded at https://github.com/TranslationalBioinformaticsUnit
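The idea of a distance score based on overlapping gene content, followed by clustering, can be illustrated with a minimal Python sketch. This is not the GeneSetCluster implementation (the package is written in R, and the abstract does not name its distance metric or clustering method). The Jaccard distance, the average-linkage merging, the `max_distance` cutoff, the function names, and the toy gene-sets are all assumptions chosen for illustration.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |A ∩ B| / |A ∪ B| (assumed metric, not taken from the paper).

    0.0 means identical gene content, 1.0 means no shared genes.
    """
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_gene_sets(gene_sets, max_distance=0.5):
    """Average-linkage agglomerative clustering of gene-sets.

    Repeatedly merges the two closest clusters and stops once the closest
    pair is farther apart than max_distance (a hypothetical cutoff).
    """
    names = list(gene_sets)
    # Pairwise distances between all gene-sets, keyed by unordered name pair.
    dist = {
        frozenset((x, y)): jaccard_distance(gene_sets[x], gene_sets[y])
        for x, y in combinations(names, 2)
    }
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Mean distance over all cross-cluster pairs of gene-sets.
        total = sum(dist[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[i], clusters[j]) > max_distance:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


# Toy input: gene-set names and memberships are made up for illustration.
gene_sets = {
    "interferon_signaling": {"STAT1", "IRF7", "ISG15", "MX1"},
    "antiviral_response": {"STAT1", "IRF7", "ISG15", "OAS1"},
    "cell_cycle": {"CDK1", "CCNB1", "MKI67"},
    "mitotic_division": {"CDK1", "CCNB1", "PLK1"},
}

for group in cluster_gene_sets(gene_sets):
    print(group)
```

In this toy input the two immune-related sets share three of five genes and the two proliferation-related sets share two of four, so each pair is grouped while the two groups stay separate. A different metric or cutoff would change the grouping.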
Pages: 7