A Novel Sampling Strategy for Active Learning over Evolving Stream Data

Citations: 0
Authors
Zhang, Xuxu [1]
Cao, Zhi [1]
Peng, Li [1]
Ren, Siqi [1]
Affiliation
[1] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410000, Hunan, Peoples R China
Keywords
Active learning; Data streams; Evidence; Random strategy
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Subject classification code
081202
Abstract
In classification tasks, data labeling is expensive and time-consuming, so active learning, which queries labels for only a small, representative portion of the data, is becoming increasingly important. However, few works address the challenges of the data stream setting, because most active learning methods are designed for non-streaming settings. To address this gap, we combine the evidence-based uncertainty sampling strategy with the split sampling strategy and propose a new sampling strategy for active learning over evolving stream data that takes advantage of the strengths of each. First, the original data stream is randomly divided into two sub-streams. Instances from one sub-stream are labeled according to the high evidence-focused uncertainty strategy, while instances from the other sub-stream are labeled by the random strategy in order to detect true concept drifts. Second, we introduce a sliding window into the high evidence-focused uncertainty strategy to determine whether an instance is a conflict-uncertainty instance. Our strategy thus addresses the effective use of evidence in the data stream setting and can choose more representative instances from evolving data streams for training a model. Finally, in experiments on four benchmark datasets, the results show that the proposed approach achieves good predictive performance compared with state-of-the-art active learning strategies.
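The abstract does not give the scoring rule, thresholds, or window handling, so the Python sketch below is only one possible reading of the two-sub-stream idea, not the authors' implementation. Everything named here is hypothetical: the class `SplitEvidenceSampler`, its parameters (`split_prob`, `random_label_prob`, `margin_threshold`, `window_size`), and the assumption that a classifier supplies a positive-class probability plus one evidence score per class. An uncertain instance is treated as a conflict-uncertainty instance when the product of its two evidence scores is at least the median of the products seen recently in the sliding window.

```python
import random
from collections import deque
from statistics import median


class SplitEvidenceSampler:
    """Illustrative sketch only: scoring rule and thresholds are assumptions."""

    def __init__(self, split_prob=0.5, random_label_prob=0.1,
                 margin_threshold=0.2, window_size=50, seed=0):
        self.split_prob = split_prob                # chance of routing to the random sub-stream
        self.random_label_prob = random_label_prob  # labeling rate inside the random sub-stream
        self.margin_threshold = margin_threshold    # assumed cutoff for "uncertain"
        self.window = deque(maxlen=window_size)     # recent evidence strengths
        self.rng = random.Random(seed)

    def is_conflict_uncertain(self, p_pos, evidence_pos, evidence_neg):
        # Uncertain means the two class probabilities are close to each other.
        margin = abs(p_pos - (1.0 - p_pos))
        if margin > self.margin_threshold:
            return False
        # Assumed strength measure: large only if BOTH classes have strong evidence.
        strength = evidence_pos * evidence_neg
        reference = median(self.window) if self.window else 0.0
        self.window.append(strength)
        return strength >= reference

    def query(self, p_pos, evidence_pos, evidence_neg):
        """Return (request_label, sub_stream_name) for one incoming instance."""
        if self.rng.random() < self.split_prob:
            # Random sub-stream: labels are requested independently of the model,
            # so changes in regions the model is confident about can still surface.
            return self.rng.random() < self.random_label_prob, "random"
        decision = self.is_conflict_uncertain(p_pos, evidence_pos, evidence_neg)
        return decision, "evidence"


if __name__ == "__main__":
    sampler = SplitEvidenceSampler(seed=42)
    # Hypothetical model outputs: (p_pos, evidence_pos, evidence_neg).
    stream = [(0.52, 3.0, 2.8), (0.95, 4.0, 0.1), (0.48, 0.2, 0.3), (0.50, 5.0, 5.0)]
    for p_pos, e_pos, e_neg in stream:
        print(sampler.query(p_pos, e_pos, e_neg))
```

Routing each instance with an independent coin flip keeps the split random, as the abstract describes, while the bounded `deque` lets the reference level for "high evidence" follow the most recent part of the stream.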
Pages: 348-354
Page count: 7