Ranking differentially expressed genes from Affymetrix gene expression data: methods with reproducibility, sensitivity, and specificity

Cited by: 65
Authors
Kadota, Koji [1]
Nakai, Yuji [1]
Shimizu, Kentaro [1]
Affiliations
[1] Univ Tokyo, Grad Sch Agr & Life Sci, Bunkyo Ku, Tokyo 1138657, Japan
Source
Algorithms for Molecular Biology, 4
Keywords
PROBE LEVEL DATA; OLIGONUCLEOTIDE ARRAYS; MICROARRAY EXPERIMENTS; HOOK-CALIBRATION; MODEL; BIOCONDUCTOR; ACCURACY
DOI
10.1186/1748-7188-4-7
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: To identify differentially expressed genes (DEGs) from microarray data, users of the Affymetrix GeneChip system must select both a preprocessing algorithm to obtain expression-level measurements and a gene ranking method to obtain the most plausible candidates. We recently recommended suitable combinations of preprocessing algorithm and gene ranking method that identify DEGs with higher sensitivity and specificity. However, researchers also want to know which combinations enhance reproducibility.
Results: We compared eight conventional gene ranking methods (weighted average difference (WAD), average difference (AD), fold change (FC), rank products (RP), moderated t statistic (modT), significance analysis of microarrays (samT), shrinkage t statistic (shrinkT), and intensity-based moderated t statistic (ibmT)) in combination with six preprocessing algorithms (PLIER, VSN, FARMS, multi-mgMOS (mmgMOS), MBEI, and GCRMA). A total of 36 real experimental datasets were evaluated on the basis of the area under the receiver operating characteristic curve (AUC) as a measure of both sensitivity and specificity. We found that the RP method performed well for VSN-, FARMS-, MBEI-, and GCRMA-preprocessed data, and that the WAD method performed well for mmgMOS-preprocessed data. Our analysis of the MicroArray Quality Control (MAQC) project's datasets showed that the FC-based gene ranking methods (WAD, AD, FC, and RP) had a higher level of reproducibility: the percentages of overlapping genes (POGs) across different sites were higher overall for the FC-based methods than for the t-statistic-based methods (modT, samT, shrinkT, and ibmT). In particular, POG values for WAD were the highest overall among the FC-based methods, irrespective of the choice of preprocessing algorithm.
Conclusion: Our results demonstrate that to increase sensitivity, specificity, and reproducibility in microarray analyses, we need to select suitable combinations of preprocessing algorithms and gene ranking methods. We recommend the use of FC-based methods, in particular RP or WAD.
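The two evaluation measures named in the abstract, POG and AUC, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the function names (`top_genes`, `pog`, `auc`), the toy scores, the choice to rank genes by absolute score, and the set of "true" DEGs are all hypothetical. POG is taken as the share of genes common to two top-n lists, and AUC is computed by pairwise comparison of true DEGs against non-DEGs.

```python
from typing import Dict, Set


def top_genes(scores: Dict[str, float], n: int) -> Set[str]:
    """Return the n genes with the largest absolute score (assumed two-sided ranking)."""
    ranked = sorted(scores, key=lambda gene: abs(scores[gene]), reverse=True)
    return set(ranked[:n])


def pog(scores_a: Dict[str, float], scores_b: Dict[str, float], n: int) -> float:
    """Percentage of overlapping genes between the top-n lists of two rankings."""
    if n <= 0:
        raise ValueError("n must be positive")
    overlap = top_genes(scores_a, n) & top_genes(scores_b, n)
    return 100.0 * len(overlap) / n


def auc(scores: Dict[str, float], true_degs: Set[str]) -> float:
    """Area under the ROC curve, computed as the fraction of (DEG, non-DEG)
    pairs in which the DEG has the larger absolute score; ties count as 0.5."""
    pos = [abs(s) for gene, s in scores.items() if gene in true_degs]
    neg = [abs(s) for gene, s in scores.items() if gene not in true_degs]
    if not pos or not neg:
        raise ValueError("need at least one DEG and one non-DEG")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical ranking statistics for the same five genes at two test sites.
site1 = {"g1": 3.2, "g2": -2.9, "g3": 0.4, "g4": 1.8, "g5": -0.1}
site2 = {"g1": 2.7, "g2": -0.3, "g3": 0.2, "g4": 2.1, "g5": -1.9}
truth = {"g1", "g2", "g4"}  # assumed set of true DEGs

print(pog(site1, site2, n=2))
print(auc(site1, truth), auc(site2, truth))
```

In this sketch a higher POG indicates that two sites agree more on their top-ranked genes (reproducibility), while a higher AUC indicates that true DEGs tend to be ranked above non-DEGs (sensitivity and specificity combined).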
Pages: 7