Gene-set distance analysis (GSDA): a powerful tool for gene-set association analysis

Cited by: 0
Authors
Cao, Xueyuan [1 ]
Pounds, Stan [2 ]
Affiliations
[1] Univ Tennessee, Hlth Sci Ctr, Dept Acute & Tertiary Care, Memphis, TN 38163 USA
[2] St Jude Childrens Res Hosp, Dept Biostat, 332 N Lauderdale St, Memphis, TN 38105 USA
Keywords
Gene profiling; Gene set; Distance correlation; ACUTE MYELOID-LEUKEMIA; FALSE DISCOVERY RATE; FUNCTIONAL CATEGORIES; ENRICHMENT ANALYSIS; EXPRESSION; MICROARRAY;
DOI
10.1186/s12859-021-04110-x
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Identifying sets of related genes (gene sets) that are empirically associated with a treatment or phenotype often yields valuable biological insights. Several methods effectively identify gene sets in which individual genes have simple monotonic relationships with categorical, quantitative, or censored event-time variables. Some distance-based methods, such as distance correlations, may detect complex non-monotone associations of a gene set with a quantitative variable that elude other methods. However, distance correlations have yet to be generalized to associate gene sets with categorical and censored event-time endpoints. Also, there is a need to determine which genes empirically drive the significance of an association of a gene set with an endpoint.

Results: We develop gene-set distance analysis (GSDA) by generalizing distance correlations to evaluate the association of a gene set with categorical and censored event-time variables. We also develop a backward elimination procedure to identify a subset of genes that empirically drive significant associations. In simulation studies, GSDA more effectively identified complex non-monotone gene-set associations than did six other published methods. In the analysis of a pediatric acute myeloid leukemia (AML) data set, GSDA was the only method to discover that event-free survival (EFS) was associated with the 56-gene AML pathway gene set, narrow that result down to 5 genes, and confirm the association of those 5 genes with EFS in a separate validation cohort. These results indicate that GSDA effectively identifies and characterizes complex non-monotone gene-set associations that are missed by other methods.

Conclusion: GSDA is a powerful and flexible method to detect gene-set association with categorical, quantitative, or censored event-time variables, especially to detect complex non-monotone gene-set associations. Available at https://CRAN.R-project.org/package=GSDA.
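The abstract's starting point, distance correlation between a gene set and a quantitative variable, can be illustrated with a minimal sketch. This is not the GSDA package's code and does not cover the categorical or censored event-time generalizations the paper develops; it is the standard sample distance correlation (biased, V-statistic form) written with NumPy. The function names and the toy data (`expression`, `phenotype`, `gene_set`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def _centered_distances(values):
    """Double-centered Euclidean distance matrix between samples (rows)."""
    values = np.asarray(values, dtype=float)
    if values.ndim == 1:
        values = values[:, None]  # treat a vector as n samples x 1 feature
    diff = values[:, None, :] - values[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    row_mean = dist.mean(axis=1, keepdims=True)
    col_mean = dist.mean(axis=0, keepdims=True)
    return dist - row_mean - col_mean + dist.mean()


def distance_correlation(x, y):
    """Sample distance correlation between x (n x p) and y (n x q)."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = (a * b).mean()   # squared sample distance covariance
    dvar_x = (a * a).mean()  # squared sample distance variance of x
    dvar_y = (b * b).mean()  # squared sample distance variance of y
    denom = np.sqrt(dvar_x * dvar_y)
    if denom == 0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


# Toy data (assumed, not from the paper): a phenotype that depends on one
# gene's expression through a symmetric, non-monotone relationship.
expression = np.linspace(-1.0, 1.0, 201)
phenotype = expression ** 2

pearson = np.corrcoef(expression, phenotype)[0, 1]
dcor_single = distance_correlation(expression, phenotype)

# A two-gene "gene set": the informative gene plus an unrelated one.
unrelated = np.sin(np.arange(201))
gene_set = np.column_stack([expression, unrelated])
dcor_set = distance_correlation(gene_set, phenotype)

print(f"Pearson correlation:        {pearson:.3f}")
print(f"Distance correlation (1):   {dcor_single:.3f}")
print(f"Distance correlation (set): {dcor_set:.3f}")
```

On this toy data the Pearson correlation is essentially zero because the relationship is symmetric about zero, while the distance correlation is clearly positive. In practice significance would be judged by a permutation test or similar, which this sketch omits.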
Pages: 22
Related papers
50 in total
  • [41] Gene-set Cohesion Analysis Tool (GCAT): a Literature Based Web Tool for Calculating Functional Cohesiveness of Gene Groups
    Xu, Lijing
    Homayouni, Ramin
    Furlotte, Nicholas A.
    Heinrich, Kevin E.
    George, E. Olusegun
    Berry, Michael W.
    BIBMW: 2009 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE WORKSHOP, 2009, : 342 - 342
  • [42] Incorporating regulatory interactions into gene-set analyses for GWAS data: A controlled analysis with the MAGMA tool
    Groenewoud, David
    Shye, Avinoam
    Elkon, Ran
    PLOS COMPUTATIONAL BIOLOGY, 2022, 18 (03)
  • [43] Gene-set analysis identifies master transcription factors in developmental courses
    Liu, Ying
    Jiang, Bo
    Zhang, Xuegong
    GENOMICS, 2009, 94 (01) : 1 - 10
  • [44] Gene-set Analysis with CGI Information for Differential DNA Methylation Profiling
    Chia-Wei Chang
    Tzu-Pin Lu
    Chang-Xian She
    Yen-Chen Feng
    Chuhsing Kate Hsiao
    Scientific Reports, 6
  • [45] Gene-set Enrichment with Mathematical Biology (GEMB)
    Cochran, Amy L.
    Nieser, Kenneth J.
    Forger, Daniel B.
    Zollner, Sebastian
    McInnis, Melvin G.
    GIGASCIENCE, 2020, 9 (10):
  • [46] Gene Mapping and Gene-Set Analysis for Milk Fever Incidence in Holstein Dairy Cattle
    Pacheco, Hendyel A.
    da Silva, Simone
    Sigdel, Anil
    Mak, Chun Kuen
    Galvao, Klibs N.
    Texeira, Rodrigo A.
    Dias, Laila T.
    Penagaricano, Francisco
    FRONTIERS IN GENETICS, 2018, 9
  • [47] RANDOM-SET METHODS IDENTIFY DISTINCT ASPECTS OF THE ENRICHMENT SIGNAL IN GENE-SET ANALYSIS
    Newton, Michael A.
    Quintana, Fernando A.
    Den Boon, Johan A.
    Sengupta, Srikumar
    Ahlquist, Paul
    ANNALS OF APPLIED STATISTICS, 2007, 1 (01) : 85 - 106
  • [48] GENE AND GENE-SET ANALYSIS REVEALS 10 GENES AND 24 FUNCTIONAL PATHWAYS FOR OSTEOMYELITIS
    Tian, W.
    Yao, S.
    Guo, Y.
    OSTEOPOROSIS INTERNATIONAL, 2020, 31 (SUPPL 1) : S295 - S296
  • [49] USING GENE-SET ANALYSIS TO GAIN BIOLOGICAL KNOWLEDGE BASED ON GWAS RESULTS
    Posthuma, Danielle
    de Leeuw, Christiaan
    EUROPEAN NEUROPSYCHOPHARMACOLOGY, 2019, 29 : S728 - S729
  • [50] GENE-SET ANALYSIS OF THE SAGE DATA IDENTIFIES PATHWAYS CONTRIBUTING TO ALCOHOL DEPENDENCE
    Biernacka, J. M.
    Johnson, J. R.
    Rider, D. N.
    Colby, C. L.
    Jenkins, G.
    Karpyak, V. M.
    Fridley, B. L.
    ALCOHOLISM-CLINICAL AND EXPERIMENTAL RESEARCH, 2010, 34 (08) : 114A - 114A