Boosting RVM classifiers for large data sets

Cited by: 0
Authors
Silva, Catarina [1, 2]
Ribeiro, Bernardete [2]
Sung, Andrew H. [3]
Affiliations
[1] Pol Inst Leiria, Sch Technol & Management, Leiria, Portugal
[2] Univ Coimbra, Dept Informat Engn, Ctr Informat & Syst, P-3000 Coimbra, Portugal
[3] New Mexico Inst Min & Technol, Inst Comp Addit Sys Anal, Dept Comp Sci, Socorro, NM USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Relevance Vector Machines (RVM) extend Support Vector Machines (SVM) to have probabilistic interpretations, to build sparse training models with fewer basis functions (i.e., relevance vectors or prototypes), and to realize Bayesian learning by placing priors over parameters (i.e., introducing hyperparameters). However, RVM algorithms do not scale up to large data sets. To overcome this problem, in this paper we propose an RVM boosting algorithm and demonstrate its potential with a text mining application. The idea is to build weaker classifiers and then improve overall accuracy by using a boosting technique for document classification. The proposed algorithm is able to incorporate all the available training data; when combined with sampling techniques for choosing the working set, the boosted learning machine is able to attain high accuracy. Experiments on the Reuters benchmark show that the approach achieves accuracy competitive with state-of-the-art SVMs; meanwhile, the sparser solution found allows real-time implementations.
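The idea summarized in the abstract (weak classifiers trained on small sampled working sets, then combined by boosting while every training example still influences the weights) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say which boosting variant the authors use, so a plain AdaBoost-style update is shown; the weak learner is a simple decision stump standing in for an RVM, which is not implemented; and the names `train_stump`, `boost`, and `predict` are hypothetical.

```python
import math
import random


def train_stump(X, y):
    """Stand-in weak learner (a decision stump, NOT an RVM).

    Returns a function mapping a feature vector to a label in {-1, +1}.
    """
    best = None  # (training errors, feature index, threshold, sign)
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for sign in (1, -1):
                errors = sum(
                    1 for x, label in zip(X, y)
                    if (sign if x[j] >= t else -sign) != label
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, sign)
    _, j, t, sign = best
    return lambda x: sign if x[j] >= t else -sign


def boost(X, y, rounds=20, working_set_size=30, seed=0):
    """AdaBoost-style loop; each weak learner sees only a sampled working set."""
    rng = random.Random(seed)
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []  # list of (alpha, weak_learner) pairs
    for _ in range(rounds):
        # Draw a small working set, favoring currently hard examples.
        idx = rng.choices(range(n), weights=weights, k=working_set_size)
        h = train_stump([X[i] for i in idx], [y[i] for i in idx])
        # Evaluate on ALL training data, so every example affects the weights.
        preds = [h(x) for x in X]
        err = sum(w for w, p, label in zip(weights, preds, y) if p != label)
        if err >= 0.5:
            continue  # no better than chance on the weighted data: discard
        err = max(err, 1e-10)  # avoid division by zero for a perfect learner
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, h))
        # Up-weight misclassified examples, down-weight correct ones.
        weights = [w * math.exp(-alpha * label * p)
                   for w, p, label in zip(weights, preds, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all weak learners."""
    score = sum(alpha * h(x) for alpha, h in ensemble)
    return 1 if score >= 0 else -1
```

On a toy problem, `boost(X, y)` with labels in {-1, +1} returns the weighted ensemble and `predict(ensemble, x)` classifies a new point. In a setting like the one the abstract describes, the stump would presumably be replaced by an RVM trained on each working set, so that each weak model stays small even when the full data set is large.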
Pages: 228 / +
Page count: 2