Identifying Differentially Expressed Genes in RNA Sequencing Data with Small Labelled Samples

Cited by: 0
Authors
Guo Y. [1]
Xiao Y. [1]
Li L. [1]
Affiliations
[1] Xi'an Jiaotong University, School of Mathematics and Statistics, Xi'an 710049, China
Keywords
Auxiliary sample; Biological system modeling; Biology; Cancer; Differentially expressed genes; Gene expression; Sequential analysis; Small sample problem; Sociology; Statistics; Two-sample independent test; Wilcoxon-Mann-Whitney test
DOI
10.1109/TCBB.2024.3382147
Abstract
RNA-seq, including bulk RNA-seq and single-cell RNA-seq, is a next-generation sequencing-based RNA profiling method that measures gene expression patterns at high resolution, and it has become an essential tool for differential gene expression analysis at the whole-transcriptome level. Identifying differential genes is a key problem in many biological studies, such as disease genetics. Two-sample location tests are widely used in case-control studies to identify significantly differential genes. However, because labelled data are costly to collect, many studies have only a small labelled sample, and traditional methods often lose power in this setting. To address this issue, we propose a novel rank-based nonparametric test, the WMW-A test, which extends the Wilcoxon-Mann-Whitney (WMW) test with an auxiliary (A) sample that is either given or generated in the form of unlabelled data. By combining the case, control and auxiliary samples, we construct a three-sample WMW-A statistic based on the gap between the average ranks of the case and control samples in the combined sample. Extensive simulation experiments and real applications on gene expression datasets, including one bulk RNA-seq dataset and two single-cell RNA-seq datasets, show that the WMW-A test can significantly improve test power for the two-sample problem with small sample sizes, using either available or generated auxiliary data. Applications to two real small SARS-CoV-2 datasets further show the improvement of the WMW-A test for identifying differentially expressed genes with small labelled samples.
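The rank-gap idea described in the abstract can be illustrated with a minimal Python sketch. It pools the case, control and auxiliary values, ranks them jointly, and takes the difference between the average ranks of the case and control samples. The abstract does not state how the statistic is normalised or how its null distribution is obtained, so the function names and the exact label-permutation p-value below are assumptions made for illustration, not the authors' WMW-A procedure.

```python
from itertools import combinations

import numpy as np
from scipy.stats import rankdata


def rank_gap_statistic(case, control, auxiliary):
    """Mean rank of case minus mean rank of control in the pooled sample.

    Hypothetical helper: the pooled sample contains case, control and
    auxiliary values; ties receive average ranks.
    """
    combined = np.concatenate([case, control, auxiliary])
    ranks = rankdata(combined)
    n_case, n_control = len(case), len(control)
    case_ranks = ranks[:n_case]
    control_ranks = ranks[n_case:n_case + n_control]
    return case_ranks.mean() - control_ranks.mean()


def exact_permutation_pvalue(case, control, auxiliary):
    """Two-sided exact p-value by relabelling case/control (an assumption).

    The auxiliary values stay unlabelled, so the pooled ranks never change;
    only the assignment of labelled ranks to case or control is enumerated.
    Feasible only for small labelled samples.
    """
    combined = np.concatenate([case, control, auxiliary])
    ranks = rankdata(combined)
    n_case, n_control = len(case), len(control)
    labelled = ranks[:n_case + n_control]
    total = labelled.sum()
    observed = abs(labelled[:n_case].mean() - labelled[n_case:].mean())
    hits = 0
    count = 0
    for idx in combinations(range(n_case + n_control), n_case):
        case_sum = labelled[list(idx)].sum()
        gap = case_sum / n_case - (total - case_sum) / n_control
        count += 1
        # Small tolerance guards against floating-point rounding.
        if abs(gap) >= observed - 1e-12:
            hits += 1
    return hits / count


if __name__ == "__main__":
    # Toy expression values for one gene (made up for illustration).
    case = [10.0, 11.0, 12.0]
    control = [1.0, 2.0, 3.0]
    auxiliary = [5.0, 6.0, 7.0]
    print(rank_gap_statistic(case, control, auxiliary))
    print(exact_permutation_pvalue(case, control, auxiliary))
```

In a per-gene analysis, a function like this would be applied to each gene in turn, followed by a multiple-testing correction; that step is omitted here because the abstract does not describe it.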
Pages: 1 / 12
Page count: 11
Related papers
50 in total
  • [21] RNA Sequencing of Decidua Reveals Differentially Expressed Genes in Recurrent Pregnancy Loss
    Yuehan Li
    Renjie Wang
    Meng Wang
    Weiming Huang
    Chang Liu
    Zishui Fang
    Shujie Liao
    Lei Jin
    Reproductive Sciences, 2021, 28 : 2261 - 2269
  • [22] Identification of Differentially Expressed and Spliced Genes with RNA Sequencing Analysis of CLL Specimens
    Liao, Wei
    Jordaan, Gwen
    Jaroszewicz, Artur
    Pellegrini, Matteo
    Sharma, Sanjai
    BLOOD, 2012, 120 (21)
  • [23] Small RNA Sequencing Reveals Differentially Expressed miRNAs in Necrotizing Enterocolitis in Rats
    Yu, Ren-qiang
    Wang, Min
    Jiang, Shan-yu
    Zhang, Ying-hui
    Zhou, Xiao-yu
    Zhou, Qin
    BIOMED RESEARCH INTERNATIONAL, 2020, 2020
  • [24] Identifying differentially expressed genes for ordinal phenotypes
    Kim, Yongkang
    Park, Taesung
    2014 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2014
  • [25] A new framework for identifying differentially expressed genes
    Li, Jie
    Tang, Xianglong
    Zhao, Wei
    Huang, Jianhua
    PATTERN RECOGNITION, 2007, 40 (11) : 3249 - 3262
  • [26] Ranking analysis for identifying differentially expressed genes
    Qi, Yunsong
    Sun, Huaijiang
    Sun, Quansen
    Pan, Lei
    GENOMICS, 2011, 97 (05) : 326 - 329
  • [27] Differentially expressed genes in small intestine and colon adenocarcinomas identified by transcriptome sequencing
    Jung, Seung Hyun
    Choi, Eun Ji
    Lee, Sug Hyung
    An, Chang Hyeok
    PATHOLOGY RESEARCH AND PRACTICE, 2020, 216 (04)
  • [28] Identification and isolation of differentially expressed genes from very small tissue samples
    Gonzalez, P
    Zigler, JS
    Epstein, DL
    Borrás, T
    BIOTECHNIQUES, 1999, 26 (05) : 884 - +
  • [29] Permutation-Based Test with Small Samples for Detecting Differentially Expressed Genes
    Lee, Ju-Hyoung
    Song, Hae-Hiang
    KOREAN JOURNAL OF APPLIED STATISTICS, 2009, 22 (05) : 1059 - 1072
  • [30] Robustness of single-cell RNA-seq for identifying differentially expressed genes
    Yong Liu
    Jing Huang
    Rajan Pandey
    Pengyuan Liu
    Bhavika Therani
    Qiongzi Qiu
    Sridhar Rao
    Aron M. Geurts
    Allen W. Cowley
    Andrew S. Greene
    Mingyu Liang
    BMC Genomics, 24