Boosting RVM classifiers for large data sets

Cited by: 0
Authors
Silva, Catarina [1, 2]
Ribeiro, Bernardete [2]
Sung, Andrew H. [3 ]
Affiliations
[1] Pol Inst Leiria, Sch Technol & Management, Leiria, Portugal
[2] Univ Coimbra, Dept Informat Engn, Ctr Informat & Syst, P-3000 Coimbra, Portugal
[3] New Mexico Inst Min & Technol, Inst Complex Addit Syst Anal, Dept Comp Sci, Socorro, NM USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Relevance Vector Machines (RVM) extend Support Vector Machines (SVM) to have probabilistic interpretations, to build sparse training models with fewer basis functions (i.e., relevance vectors or prototypes), and to realize Bayesian learning by placing priors over parameters (i.e., introducing hyperparameters). However, RVM algorithms do not scale up to large data sets. To overcome this problem, in this paper we propose an RVM boosting algorithm and demonstrate its potential with a text mining application. The idea is to build weaker classifiers, and then improve overall accuracy by using a boosting technique for document classification. The proposed algorithm is able to incorporate all the available training data; when combined with sampling techniques for choosing the working set, the boosted learning machine is able to attain high accuracy. Experiments on the Reuters benchmark show that the approach achieves accuracy competitive with state-of-the-art SVMs; meanwhile, the sparser solution found allows real-time implementations.
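The general idea in the abstract, training weak classifiers on small sampled working sets and combining them by boosting, can be illustrated with a minimal AdaBoost-style sketch. This is not the authors' algorithm: a one-dimensional threshold classifier stands in for the RVM weak learner, and the function names (`train_stump`, `boost`, `predict`), the weighted-sampling scheme, and all parameter values are assumptions made for illustration only.

```python
import math
import random


def train_stump(points, labels):
    """Fit a 1-D threshold classifier (a stand-in weak learner, NOT an RVM)."""
    best = None
    for threshold in sorted(set(points)):
        for sign in (1, -1):
            # Count training errors for "predict sign when x >= threshold".
            errors = sum(
                1
                for x, y in zip(points, labels)
                if (sign if x >= threshold else -sign) != y
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, sign)
    _, threshold, sign = best
    return lambda x: sign if x >= threshold else -sign


def boost(points, labels, rounds=15, working_set_size=30, seed=0):
    """AdaBoost-style loop where each weak learner sees only a sampled working set."""
    rng = random.Random(seed)
    n = len(points)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # Assumed sampling scheme: draw the working set according to the
        # current boosting weights, so hard examples are seen more often.
        idx = rng.choices(range(n), weights=weights, k=working_set_size)
        clf = train_stump([points[i] for i in idx], [labels[i] for i in idx])
        # The weighted error is measured on ALL training data, not just the sample.
        err = sum(w for w, x, y in zip(weights, points, labels) if clf(x) != y)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, clf))
        # Increase the weight of misclassified examples, then renormalize.
        weights = [
            w * math.exp(-alpha * y * clf(x))
            for w, x, y in zip(weights, points, labels)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all weak learners."""
    score = sum(alpha * clf(x) for alpha, clf in ensemble)
    return 1 if score >= 0 else -1


# Toy data (hypothetical): 100 points in [0, 1), positive when x >= 0.5.
points = [i / 100 for i in range(100)]
labels = [1 if x >= 0.5 else -1 for x in points]
ensemble = boost(points, labels)
accuracy = sum(predict(ensemble, x) == y for x, y in zip(points, labels)) / len(points)
```

Each weak learner here is fitted on only 30 of the 100 examples, yet every example influences the ensemble through the weight update. This mirrors the abstract's point that boosting can incorporate all the training data even when each individual model is trained on a small working set.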
Pages: 228 / +
Page count: 2