Boosting decision stumps for dynamic feature selection on data streams

Cited by: 23
Authors
Barddal, Jean Paul [1]
Enembreck, Fabricio [1]
Gomes, Heitor Murilo [2]
Bifet, Albert [2]
Pfahringer, Bernhard [3]
Affiliations
[1] Pontificia Univ Catolica Parana, Grad Program Informat PPGIa, Curitiba, Parana, Brazil
[2] Univ Paris Saclay, Inst Mines Telecom, Telecom ParisTech, INFRES, Paris, France
[3] Univ Waikato, Dept Comp Sci, Hamilton, New Zealand
Keywords
Data stream mining; Feature selection; Concept drift; Feature drift; ONLINE; CLASSIFICATION; MACHINE; DRIFT;
DOI
10.1016/j.is.2019.02.003
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Feature selection targets the identification of which features of a dataset are relevant to the learning task. It is also widely used to improve computation times, reduce computational requirements, decrease the impact of the curse of dimensionality, and enhance the generalization rates of classifiers. In data streams, classifiers benefit from all of the above and, more importantly, from tracking the relevant subset of features, which may drift over time. In this paper, we propose a novel dynamic feature selection method for data streams called Adaptive Boosting for Feature Selection (ABFS). ABFS chains decision stumps and drift detectors and, as a result, identifies with reasonable success which features are relevant to the learning task as the stream progresses. In addition to the proposed algorithm, we bring feature selection-specific metrics from batch learning to streaming scenarios and evaluate ABFS according to these metrics in both synthetic and real-world scenarios. The results show that ABFS improves the classification rates of different types of learners and can also improve computational resource usage. © 2019 Elsevier Ltd. All rights reserved.
Pages: 13-29
Page count: 17
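
The abstract describes chaining decision stumps so that the features they split on form the selected subset. The following is only a minimal batch sketch of that general idea using classic AdaBoost weight updates; it is not the ABFS algorithm from the paper. It omits drift detectors and stream processing entirely, and the function names (`best_stump`, `boosted_stump_features`), the rule that each stump must use a feature not yet selected, and the toy data are all assumptions made for illustration.

```python
import math


def best_stump(X, y, w, excluded):
    """Return (weighted_error, feature, threshold, polarity) of the best stump.

    A stump predicts `polarity` when x[feature] <= threshold, else -polarity.
    Features listed in `excluded` are skipped (illustrative assumption).
    """
    best = None
    for f in range(len(X[0])):
        if f in excluded:
            continue
        for thr in sorted({row[f] for row in X}):
            for pol in (1, -1):
                err = sum(
                    wi
                    for row, yi, wi in zip(X, y, w)
                    if (pol if row[f] <= thr else -pol) != yi
                )
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best


def boosted_stump_features(X, y, n_rounds):
    """Select up to n_rounds features by boosting stumps; labels must be -1 or +1."""
    n = len(X)
    w = [1.0 / n] * n  # uniform instance weights to start
    selected = []
    for _ in range(n_rounds):
        stump = best_stump(X, y, w, set(selected))
        if stump is None:  # no unused features left
            break
        err, f, thr, pol = stump
        if err >= 0.5:  # stump is no better than chance; stop chaining
            break
        selected.append(f)
        err = max(err, 1e-10)  # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        # AdaBoost update: raise the weight of misclassified instances
        w = [
            wi * math.exp(-alpha * yi * (pol if row[f] <= thr else -pol))
            for row, yi, wi in zip(X, y, w)
        ]
        total = sum(w)
        w = [wi / total for wi in w]
    return selected


if __name__ == "__main__":
    # Toy data (hypothetical): feature 1 separates the classes, feature 0 does not.
    X = [[5, 0.1], [1, 0.2], [4, 0.8], [2, 0.9]]
    y = [-1, -1, 1, 1]
    print(boosted_stump_features(X, y, 1))  # prints [1]
```

A streaming variant would need incremental stumps and a mechanism to reset them when a drift detector fires; how ABFS does this is specified in the paper, not in this sketch.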