Deep Neural Networks for High Dimension, Low Sample Size Data

Cited by: 0
Authors
Liu, Bo [1]
Wei, Ying [1]
Zhang, Yu [1]
Yang, Qiang [1]
Affiliations
[1] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
FEATURE-SELECTION;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep neural networks (DNNs) have achieved breakthroughs in applications with large sample sizes. However, when facing high dimension, low sample size (HDLSS) data, such as the phenotype prediction problem using genetic data in bioinformatics, DNNs suffer from overfitting and high-variance gradients. In this paper, we propose a DNN model tailored for HDLSS data, named Deep Neural Pursuit (DNP). DNP selects a subset of the high-dimensional features to alleviate overfitting and averages over multiple dropouts to compute gradients with low variance. As the first DNN method applied to HDLSS data, DNP offers high nonlinearity, robustness to high dimensionality, the ability to learn from a small number of samples, stability in feature selection, and end-to-end training. We demonstrate these advantages of DNP via empirical results on both synthetic and real-world biological datasets.
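The two ideas named in the abstract, greedy feature selection and gradients averaged over multiple dropout masks, can be illustrated with a small sketch. This is not the authors' algorithm: the one-hidden-layer tanh network, the squared-error loss, and the names `averaged_input_gradients` and `pick_next_feature` are all assumptions made for illustration. Weights of unselected features are treated as zero, and the candidate with the largest averaged gradient norm is chosen.

```python
import math
import random


def averaged_input_gradients(X, y, W, v, selected, n_masks=20, p_drop=0.5, seed=0):
    """Average dLoss/dW[k][j] over several dropout masks (hypothetical helper).

    Assumed toy model: h = tanh(W x) using only the selected features,
    prediction = v . (mask * h), loss = 0.5 * (prediction - target)^2.
    """
    rng = random.Random(seed)
    n_hidden, n_features = len(W), len(X[0])
    grads = [[0.0] * n_features for _ in range(n_hidden)]
    scale = 1.0 / (n_masks * len(X))
    for _ in range(n_masks):
        # Inverted dropout on hidden units: kept units are rescaled by 1 / (1 - p_drop).
        mask = [0.0 if rng.random() < p_drop else 1.0 / (1.0 - p_drop)
                for _ in range(n_hidden)]
        for x, target in zip(X, y):
            # Forward pass sees only the currently selected features.
            h = [math.tanh(sum(W[k][j] * x[j] for j in selected))
                 for k in range(n_hidden)]
            err = sum(v[k] * mask[k] * h[k] for k in range(n_hidden)) - target
            for k in range(n_hidden):
                delta = err * v[k] * mask[k] * (1.0 - h[k] ** 2)
                # Gradient for every input weight, including unselected candidates.
                for j in range(n_features):
                    grads[k][j] += scale * delta * x[j]
    return grads


def pick_next_feature(grads, selected):
    """Return the unselected feature whose averaged gradient has the largest norm."""
    best_j, best_norm = None, -1.0
    for j in range(len(grads[0])):
        if j in selected:
            continue
        norm = math.sqrt(sum(row[j] ** 2 for row in grads))
        if norm > best_norm:
            best_j, best_norm = j, norm
    return best_j


# Toy HDLSS-style data: 4 samples, 5 features; only feature 3 tracks the target.
X = [[0.1, -0.2, 0.0, 1.0, 0.1],
     [0.0, 0.1, 0.2, -1.0, -0.1],
     [-0.1, 0.0, 0.1, 1.0, 0.0],
     [0.2, 0.1, -0.1, -1.0, 0.1]]
y = [1.0, -1.0, 1.0, -1.0]
W = [[0.0] * 5 for _ in range(4)]   # input weights, all zero before any selection
v = [0.5, -0.3, 0.8, 0.1]           # fixed output weights (arbitrary values)

selected = set()
grads = averaged_input_gradients(X, y, W, v, selected)
next_feature = pick_next_feature(grads, selected)
print("next feature to add:", next_feature)
```

Averaging over many masks smooths out the randomness that a single dropout draw would inject into the selection score, which is the variance-reduction intuition the abstract describes. A fuller version would retrain the weights after each selection and repeat until a feature budget is reached.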
Pages: 2287 - 2293
Number of pages: 7