Ranking differentially expressed genes from Affymetrix gene expression data: methods with reproducibility, sensitivity, and specificity

Cited by: 65
Authors
Kadota, Koji [1]
Nakai, Yuji [1]
Shimizu, Kentaro [1]
Affiliations
[1] Univ Tokyo, Grad Sch Agr & Life Sci, Bunkyo Ku, Tokyo 1138657, Japan
Keywords
PROBE LEVEL DATA; OLIGONUCLEOTIDE ARRAYS; MICROARRAY EXPERIMENTS; HOOK-CALIBRATION; MODEL; BIOCONDUCTOR; ACCURACY;
DOI
10.1186/1748-7188-4-7
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Background: To identify differentially expressed genes (DEGs) from microarray data, users of the Affymetrix GeneChip system need to select both a preprocessing algorithm to obtain expression-level measurements and a way of ranking genes to obtain the most plausible candidates. We recently recommended suitable combinations of a preprocessing algorithm and gene ranking method that can be used to identify DEGs with a higher level of sensitivity and specificity. However, in addition to these recommendations, researchers also want to know which combinations enhance reproducibility.
Results: We compared eight conventional gene ranking methods (weighted average difference (WAD), average difference (AD), fold change (FC), rank products (RP), moderated t statistic (modT), significance analysis of microarrays (samT), shrinkage t statistic (shrinkT), and intensity-based moderated t statistic (ibmT)) in combination with six preprocessing algorithms (PLIER, VSN, FARMS, multi-mgMOS (mmgMOS), MBEI, and GCRMA). A total of 36 real experimental datasets were evaluated on the basis of the area under the receiver operating characteristic curve (AUC) as a measure of both sensitivity and specificity. We found that the RP method performed well for VSN-, FARMS-, MBEI-, and GCRMA-preprocessed data, and the WAD method performed well for mmgMOS-preprocessed data. Our analysis of the MicroArray Quality Control (MAQC) project's datasets showed that the FC-based gene ranking methods (WAD, AD, FC, and RP) had a higher level of reproducibility: the percentages of overlapping genes (POGs) across different sites for the FC-based methods were higher overall than those for the t-statistic-based methods (modT, samT, shrinkT, and ibmT). In particular, POG values for WAD were the highest overall among the FC-based methods irrespective of the choice of preprocessing algorithm.
Conclusion: Our results demonstrate that to increase sensitivity, specificity, and reproducibility in microarray analyses, we need to select suitable combinations of preprocessing algorithms and gene ranking methods. We recommend the use of FC-based methods, in particular RP or WAD.
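The two evaluation measures named in the abstract, AUC and POG, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names (`auc_from_scores`, `top_genes`, `pog`), the use of absolute score values for ranking, and the toy inputs are assumptions made purely for illustration. AUC is computed here through the standard Mann-Whitney identity (the probability that a true DEG outscores a non-DEG), and POG is taken as the percentage of genes shared by two equally sized top-ranked lists.

```python
from typing import Dict, Sequence, Set


def auc_from_scores(scores: Sequence[float], is_deg: Sequence[bool]) -> float:
    """Area under the ROC curve for a gene ranking.

    Uses the Mann-Whitney identity: AUC is the fraction of
    (true DEG, non-DEG) pairs in which the DEG has the higher score,
    with ties counted as half.
    """
    pos = [s for s, d in zip(scores, is_deg) if d]
    neg = [s for s, d in zip(scores, is_deg) if not d]
    if not pos or not neg:
        raise ValueError("need at least one true DEG and one non-DEG")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def top_genes(scores: Dict[str, float], k: int) -> Set[str]:
    """Return the k genes with the largest absolute score (assumed ranking rule)."""
    ranked = sorted(scores, key=lambda g: abs(scores[g]), reverse=True)
    return set(ranked[:k])


def pog(list_a: Set[str], list_b: Set[str]) -> float:
    """Percentage of overlapping genes between two equally sized gene lists."""
    if len(list_a) != len(list_b) or not list_a:
        raise ValueError("gene lists must be non-empty and of equal size")
    return 100.0 * len(list_a & list_b) / len(list_a)


if __name__ == "__main__":
    # Toy ranking scores with known truth labels (hypothetical data).
    print(auc_from_scores([2.5, 0.1, 1.8, 0.3], [True, False, True, False]))

    # Toy scores for the same genes measured at two sites (hypothetical data).
    site_a = {"g1": 2.0, "g2": -1.5, "g3": 0.2, "g4": 0.1}
    site_b = {"g1": 1.8, "g2": 0.3, "g3": -1.9, "g4": 0.0}
    print(pog(top_genes(site_a, 2), top_genes(site_b, 2)))
```

In this sketch a higher POG between sites corresponds to the kind of reproducibility the abstract reports for the FC-based methods. The pairwise AUC loop is quadratic in the number of genes and is meant only to make the definition explicit; a rank-based implementation would be preferable for full arrays.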
Pages: 7