Group Learning for High-Dimensional Sparse Data

Authors
Cherkassky, Vladimir [1, 2]
Chen, Hsiang-Han [2 ]
Shiao, Han-Tai [1 ]
Affiliations
[1] Univ Minnesota, Dept Elect & Comp Engn, Minneapolis, MN 55455 USA
[2] Univ Minnesota, Bioinformat & Computat Biol, Minneapolis, MN 55455 USA
Keywords
binary classification; digit recognition; feature selection; histogram of projections; Group Learning; iEEG; seizure prediction; SVM; unbalanced data
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We describe a new methodology for supervised learning with sparse data, i.e., when the number of input features (d) is much larger than the number of training samples (n). Under the proposed approach, all d available input features are split into t subsets, effectively yielding a larger number (t*n) of labeled training samples in a lower-dimensional input space (of dimensionality d/t). This modified training data is then used to estimate a classifier that makes predictions in the lower-dimensional space. In this paper, a standard SVM is used to train the classifier. During testing (prediction), the group of t predictions made by the SVM classifier must be combined via intelligent post-processing rules in order to make a prediction for a test input in the original d-dimensional space. The novelty of our approach lies in the design and empirical validation of these post-processing rules under the Group Learning setting. We demonstrate that such post-processing rules effectively reflect general (common-sense) a priori knowledge about the application data. Specifically, we propose two different post-processing schemes and demonstrate their effectiveness in two real-life application domains: handwritten digit recognition and seizure prediction from iEEG signals. These empirical results show superior performance of the Group Learning approach for sparse data under both balanced and unbalanced classification settings.
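The data flow described in the abstract (split d features into t subsets, train on t*n lower-dimensional samples, then combine t predictions per test input) can be illustrated with a minimal sketch. Several parts here are assumptions rather than the authors' method: the features are split into contiguous equal-size chunks, a nearest-centroid rule stands in for the SVM so that the example needs no third-party library, and a simple majority vote stands in for the paper's post-processing rules, which the abstract does not specify. All function names are hypothetical.

```python
from collections import Counter
from typing import Callable, Hashable, List, Sequence, Tuple

Vector = List[float]


def split_features(x: Sequence[float], t: int) -> List[Vector]:
    """Split one d-dimensional input into t contiguous chunks of size d/t."""
    d = len(x)
    if t <= 0 or d % t != 0:
        raise ValueError("this sketch assumes t > 0 and d divisible by t")
    size = d // t
    return [list(x[i * size:(i + 1) * size]) for i in range(t)]


def make_group_training_set(
    X: Sequence[Sequence[float]], y: Sequence[Hashable], t: int
) -> Tuple[List[Vector], List[Hashable]]:
    """Turn n samples in d dimensions into t*n samples in d/t dimensions."""
    X_small: List[Vector] = []
    y_small: List[Hashable] = []
    for x, label in zip(X, y):
        for chunk in split_features(x, t):
            X_small.append(chunk)
            y_small.append(label)  # each chunk inherits its parent's label
    return X_small, y_small


def fit_nearest_centroid(
    X: Sequence[Vector], y: Sequence[Hashable]
) -> Callable[[Vector], Hashable]:
    """Stand-in classifier (NOT an SVM): predict the class with the closest mean."""
    counts = Counter(y)
    sums = {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
    centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}

    def predict(x: Vector) -> Hashable:
        def sq_dist(c: Hashable) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, centroids[c]))
        return min(centroids, key=sq_dist)

    return predict


def predict_group(
    predict: Callable[[Vector], Hashable], x: Sequence[float], t: int
) -> Hashable:
    """Combine the t chunk-level predictions by majority vote.

    Majority voting is an illustrative assumption, not the paper's rule.
    """
    votes = Counter(predict(chunk) for chunk in split_features(x, t))
    return votes.most_common(1)[0][0]
```

To use the sketch, one would call `make_group_training_set(X, y, t)`, pass the result to `fit_nearest_centroid`, and then call `predict_group` on each test input with the same `t`. With an even `t` the vote can tie, in which case this sketch returns the tied label that was seen first; a real post-processing rule would need to resolve such cases deliberately.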
Pages: 10