Parametric and nonparametric two-sample tests for feature screening in class comparison: a simulation study

Cited by: 3
Authors
Landoni, Elena [1]
Ambrogi, Federico [2]
Mariani, Luigi [1]
Miceli, Rosalba [1]
Affiliations
[1] Fdn IRCCS Ist Nazl Tumori, Milan, Italy
[2] Univ Milan, Milan, Italy
Keywords
high-dimensional data; class comparison; location-scale problem; general two-sample problem; mixtures;
DOI
10.2427/11808
Chinese Library Classification number
R1 [Preventive Medicine, Hygiene];
Discipline classification code
1004 ; 120402 ;
Abstract
Background: Identifying a test that is sensitive to location, scale and shape for detecting differentially expressed features between two comparison groups is a key point in high-dimensional studies. The most commonly used tests target differences in location, but general distributional discrepancies might be important to reveal differential biological processes.
Methods: A simulation study was conducted to compare the performance of a set of two-sample tests under different distributional patterns: Student's t, Welch's t, Wilcoxon-Mann-Whitney (WMW), Podgor-Gastwirth PG2, Cucconi, Kolmogorov-Smirnov (KS), Cramér-von Mises (CvM), Anderson-Darling (AD) and Zhang tests (Z_K, Z_C and Z_A). The same tests were applied to a real data example.
Results: The AD, CvM, Z_A and Z_C tests proved to be the most sensitive in mixture distribution patterns, while still maintaining high power in normal distribution patterns. At best, the AD test showed a power loss of approximately 2% in the comparison of two normal distributions, but a gain of approximately 32% with mixture distributions, relative to the parametric tests. Accordingly, the AD test detected the greatest number of differentially expressed features in the real data application.
Conclusion: The tests for the general two-sample problem introduce a more general concept of 'differential expression', thus overcoming the limitations of the other tests, which are restricted to specific moments of the feature distributions. In particular, the AD test should be considered a powerful alternative to the parametric tests for feature screening, in order to keep as many discriminative features as possible for the class prediction analysis.
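The kind of comparison described in the Methods can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation design: the sample size, the number of replicates, the unit location shift and the two-component mixture (centres at -2 and 2, standard deviation 0.5) are all assumptions chosen for illustration, and only the tests available in SciPy are included (the PG2, Cucconi and Zhang tests are omitted). Note also that SciPy's `anderson_ksamp` reports a p-value capped to the range 0.001 to 0.25, which is sufficient for a 0.05 threshold.

```python
import warnings

import numpy as np
from scipy import stats

ALPHA = 0.05  # nominal significance level (assumed)


def p_values(x, y):
    """Return the p-value of each two-sample test for samples x and y."""
    with warnings.catch_warnings():
        # anderson_ksamp warns when its p-value hits the 0.001 / 0.25 caps.
        warnings.simplefilter("ignore")
        # The third element of the result is the (capped) p-value.
        ad_p = stats.anderson_ksamp([x, y])[2]
    return {
        "Student t": stats.ttest_ind(x, y, equal_var=True).pvalue,
        "Welch t": stats.ttest_ind(x, y, equal_var=False).pvalue,
        "WMW": stats.mannwhitneyu(x, y, alternative="two-sided").pvalue,
        "KS": stats.ks_2samp(x, y).pvalue,
        "CvM": stats.cramervonmises_2samp(x, y).pvalue,
        "AD": ad_p,
    }


def draw_mixture(rng, n):
    """Hypothetical bimodal mixture with mean 0: centres at -2 and 2, sd 0.5."""
    centres = rng.choice([-2.0, 2.0], size=n)
    return rng.normal(loc=centres, scale=0.5)


# Group 1 is always N(0, 1); each scenario defines how group 2 is drawn.
SCENARIOS = {
    "null": lambda rng, n: rng.normal(0.0, 1.0, n),   # same distribution
    "shift": lambda rng, n: rng.normal(1.0, 1.0, n),  # location shift only
    "mixture": draw_mixture,                          # same mean, other shape
}


def estimate_power(n_sim=200, n=40, seed=0):
    """Estimate the rejection rate of every test in every scenario."""
    rng = np.random.default_rng(seed)
    power = {}
    for name, draw_y in SCENARIOS.items():
        rejections = {}
        for _ in range(n_sim):
            x = rng.normal(0.0, 1.0, n)
            y = draw_y(rng, n)
            for test, p in p_values(x, y).items():
                rejections[test] = rejections.get(test, 0) + int(p < ALPHA)
        power[name] = {t: c / n_sim for t, c in rejections.items()}
    return power


if __name__ == "__main__":
    for scenario, rates in estimate_power().items():
        print(scenario, {t: round(r, 2) for t, r in rates.items()})
```

In the "mixture" scenario the two groups share the same mean, so tests aimed at location have little to detect, whereas the general two-sample tests can respond to the difference in shape and spread. The "null" scenario serves as a rough check that each test rejects at about the nominal rate.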
Pages: 11
Related papers
50 in total
  • [41] Nonparametric Two-Sample Tests of High Dimensional Mean Vectors via Random Integration
    Jiang, Yunlu
    Wang, Xueqin
    Wen, Canhong
    Jiang, Yukang
    Zhang, Heping
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2024, 119 (545) : 701 - 714
  • [42] Robust multivariate nonparametric tests for detection of two-sample location shift in clinical trials
    Jiang, Xuejun
    Guo, Xu
    Zhang, Ning
    Wang, Bo
    Zhang, Bo
    PLOS ONE, 2018, 13 (04):
  • [43] Characterising transitive two-sample tests
    Lumley, Thomas
    Gillen, Daniel L.
    STATISTICS & PROBABILITY LETTERS, 2016, 109 : 118 - 123
  • [44] Generalized kernel two-sample tests
    Song, Hoseung
    Chen, Hao
    BIOMETRIKA, 2024, 111 (03) : 755 - 770
  • [45] Neutralise: An open science initiative for neutral comparison of two-sample tests
    Kodalci, Leyla
    Thas, Olivier
    BIOMETRICAL JOURNAL, 2024, 66 (01)
  • [46] Efficient and adaptive nonparametric test for the two-sample problem
    Ducharme, GR
    Ledwina, T
    ANNALS OF STATISTICS, 2003, 31 (06) : 2036 - 2058
  • [47] Proposed nonparametric test for the mixed two-sample design
    Magel R.C.
    Fu R.
    Journal of Statistical Theory and Practice, 2014, 8 (2) : 221 - 237
  • [48] A ROBUST AND NONPARAMETRIC TWO-SAMPLE TEST IN HIGH DIMENSIONS
    Qiu, Tao
    Xu, Wangli
    Zhu, Liping
    STATISTICA SINICA, 2021, 31 (04) : 1853 - 1869
  • [49] Nonparametric two-sample comparisons of changes on ordinal responses
    Bajorski, P
    Petkau, J
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1999, 94 (447) : 970 - 978
  • [50] Two-sample nonparametric test for proportional reversed hazards
    Khan, Ruhul Ali
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2023, 182