REC: fast sparse regression-based multicategory classification

Cited by: 1
Authors
Zhang, Chong [1]
Lu, Xiaoling [2]
Zhu, Zhengyuan [3]
Hu, Yin [4]
Singh, Darshan [5]
Jones, Corbin [6]
Liu, Jinze [7]
Prins, Jan F. [5]
Liu, Yufeng [8]
Affiliations
[1] Univ Waterloo, Dept Stat & Actuarial Sci, Waterloo, ON N2L 3G1, Canada
[2] Renmin Univ China, Ctr Appl Stat, Sch Stat, Beijing, Peoples R China
[3] Iowa State Univ, Dept Stat, Ames, IA USA
[4] Sage Bionetworks, Seattle, WA USA
[5] Univ North Carolina Chapel Hill, Dept Comp Sci, Chapel Hill, NC USA
[6] Univ North Carolina Chapel Hill, Dept Biol, Chapel Hill, NC USA
[7] Univ Kentucky, Dept Comp Sci, Lexington, KY 40506 USA
[8] Univ North Carolina Chapel Hill, Dept Stat & Operat Res, UNC Lineberger Comprehens Canc Ctr, Dept Genet, Dept Biostat, Carolina Ctr Genome Sci, Chapel Hill, NC USA
Funding
National Natural Science Foundation of China; U.S. National Science Foundation
Keywords
LASSO; Parallel computing; Probability estimation; Simplex; Variable selection; Support vector machines; Tumor classification; Logistic regression; Shrinkage
DOI
10.4310/SII.2017.v10.n2.a2
Chinese Library Classification number
Q [Biological sciences]
Discipline classification codes
07; 0710; 09
Abstract
Recent advances in technology enable researchers to gather and store enormous data sets with ultra-high dimensionality. In bioinformatics, microarray and next-generation sequencing technologies can produce data with tens of thousands of predictors of biomarkers. On the other hand, the corresponding sample sizes are often limited. For classification problems, to predict new observations with high accuracy, and to better understand the effect of predictors on classification, it is desirable, and often necessary, to train the classifier with variable selection. In the literature, sparse regularized classification techniques have been popular due to their ability to perform classification and variable selection simultaneously. Despite their success, such sparse penalized methods can be computationally slow when the dimension of the problem is ultra-high. To overcome this challenge, we propose a new sparse REgression based multicategory Classifier (REC). Our method uses a simplex to represent the different categories of the classification problem. A major advantage of REC is that the optimization can be decoupled into smaller independent sparse penalized regression problems, and hence solved using parallel computing. Consequently, REC is extraordinarily fast computationally. Moreover, REC is able to provide class conditional probability estimation. Simulated examples and applications to microarray and next-generation sequencing data suggest that REC is very competitive when compared to several existing methods.
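The decoupling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made for illustration only: a standard regular-simplex coding with K unit-norm vertices in R^(K-1), a squared-error loss with an L1 (LASSO) penalty and no intercept, a plain coordinate-descent solver, and prediction by the largest inner product with a vertex. All function names and the toy data are hypothetical. Each of the K-1 response coordinates is fitted by its own sparse regression, so the fits share nothing and can be dispatched independently.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def simplex_vertices(k):
    """Return k unit-norm, equiangular vertices in R^(k-1), centered at 0."""
    d = k - 1
    verts = [[1.0 / math.sqrt(d)] * d]
    shift = -(1.0 + math.sqrt(k)) / d ** 1.5
    scale = math.sqrt(k / d)
    for j in range(d):
        v = [shift] * d
        v[j] += scale
        verts.append(v)
    return verts


def soft_threshold(z, t):
    """Shrink z toward zero by t (the LASSO proximal step)."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)*||y - X b||^2 + lam*||b||_1 (assumed loss)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return beta


def fit_rec(X, labels, k, lam):
    """Fit k-1 independent sparse regressions, one per simplex coordinate."""
    verts = simplex_vertices(k)

    def fit_coordinate(q):
        y_q = [verts[c][q] for c in labels]  # q-th coordinate of each label's vertex
        return lasso_cd(X, y_q, lam)

    # Threads only illustrate that the fits are independent; pure-Python
    # threads do not speed up CPU-bound work, so a real implementation
    # would use processes or a cluster.
    with ThreadPoolExecutor() as pool:
        B = list(pool.map(fit_coordinate, range(k - 1)))
    return B, verts


def predict(B, verts, x):
    """Assign x to the class whose vertex has the largest inner product with f(x)."""
    f = [sum(b_j * x_j for b_j, x_j in zip(b, x)) for b in B]
    scores = [sum(f_q * v_q for f_q, v_q in zip(f, v)) for v in verts]
    return scores.index(max(scores))


# Made-up toy data: two informative features near each class vertex,
# plus a third pure-noise feature that the L1 penalty should drop.
K = 3
offsets = [(0.1, -0.1, 0.1), (-0.1, 0.1, 0.1), (0.1, 0.1, -0.1), (-0.1, -0.1, -0.1)]
centers = simplex_vertices(K)
X, labels = [], []
for c in range(K):
    for e1, e2, e3 in offsets:
        X.append([centers[c][0] + e1, centers[c][1] + e2, e3])
        labels.append(c)

B, verts = fit_rec(X, labels, K, lam=0.05)
predicted = [predict(B, verts, x) for x in X]
```

In this sketch the number of regression problems grows with the number of classes (K-1 fits), while each fit has the size of a single-response LASSO problem, which is what makes the parallel dispatch natural. The class conditional probability estimation mentioned in the abstract is not covered here.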
Pages: 175-185
Page count: 11