Group Learning for High-Dimensional Sparse Data

Cited by: 0
Authors
Cherkassky, Vladimir [1, 2]
Chen, Hsiang-Han [2]
Shiao, Han-Tai [1]
Affiliations
[1] Univ Minnesota, Dept Elect & Comp Engn, Minneapolis, MN 55455 USA
[2] Univ Minnesota, Bioinformat & Computat Biol, Minneapolis, MN 55455 USA
Keywords
binary classification; digit recognition; feature selection; histogram of projections; Group Learning; iEEG; seizure prediction; SVM; unbalanced data
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We describe a new methodology for supervised learning with sparse data, i.e., when the number of input features (d) is much larger than the number of training samples (n). Under the proposed approach, all d available input features are split into t subsets, effectively yielding a larger number (t*n) of labeled training samples in a lower-dimensional input space (of dimensionality d/t). This modified training data is then used to estimate a classifier that makes predictions in the lower-dimensional space. In this paper, a standard SVM is used to train the classifier. During testing (prediction), the group of t predictions made by the SVM classifier must be combined via intelligent post-processing rules to produce a prediction for a test input in the original d-dimensional space. The novelty of our approach lies in the design and empirical validation of these post-processing rules under the Group Learning setting. We demonstrate that such post-processing rules effectively reflect general (common-sense) a priori knowledge about the application data. Specifically, we propose two different post-processing schemes and demonstrate their effectiveness in two real-life application domains: handwritten digit recognition and seizure prediction from iEEG signals. These empirical results show superior performance of the Group Learning approach for sparse data under both balanced and unbalanced classification settings.
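The data flow described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that do not come from the paper: features are split into contiguous, equal-sized chunks; a nearest-centroid classifier stands in for the SVM so that the example needs only the standard library; and a simple majority vote stands in for the paper's post-processing rules, which the abstract does not specify. All function names are hypothetical.

```python
from collections import Counter


def split_features(x, t):
    """Split one d-dimensional sample into t contiguous chunks of size d // t.
    Contiguous chunking is an assumption; the abstract does not say how to split."""
    d = len(x)
    if d % t != 0:
        raise ValueError("this sketch assumes d is divisible by t")
    size = d // t
    return [x[i * size:(i + 1) * size] for i in range(t)]


def make_group_training_set(X, y, t):
    """Turn n samples of dimension d into t*n samples of dimension d/t."""
    X_small, y_small = [], []
    for x, label in zip(X, y):
        for chunk in split_features(x, t):
            X_small.append(chunk)
            y_small.append(label)  # each chunk inherits its sample's label
    return X_small, y_small


def fit_centroids(X_small, y_small):
    """Stand-in classifier (NOT the SVM used in the paper): per-class mean vectors."""
    sums, counts = {}, Counter()
    for x, label in zip(X_small, y_small):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[label] += 1
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}


def predict_chunk(centroids, chunk):
    """Predict the class whose centroid is closest to one low-dimensional chunk."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(centroids, key=lambda c: sq_dist(centroids[c], chunk))


def predict_group(centroids, x, t):
    """Combine the t chunk-level predictions for one test input.
    Majority vote is a placeholder rule, not one of the paper's two schemes."""
    votes = [predict_chunk(centroids, chunk) for chunk in split_features(x, t)]
    return Counter(votes).most_common(1)[0][0], votes
```

With n = 2 training samples of dimension d = 6 and t = 3, `make_group_training_set` returns 6 labeled samples of dimension 2, and `predict_group` returns both the combined label and the t individual votes, so a different combining rule can be substituted without touching the rest of the sketch.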
Pages: 10