Active mining of data streams

Cited by: 0
Authors
Fan, W [1]
Huang, YA [1]
Wang, HX [1]
Yu, PS [1]
Affiliations
[1] IBM Corp, Thomas J Watson Res Ctr, Hawthorne, NY 10532 USA
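Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes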
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Most previously proposed methods for mining data streams make the unrealistic assumption that a "labelled" data stream is readily available and can be mined at any time. However, in most real-world problems, labelled data streams are rarely available immediately. For this reason, models are refreshed periodically, usually in sync with the data availability schedule. This "passive periodic refresh" has several undesirable consequences. In this paper, we propose a new concept of demand-driven active data mining. It estimates the error of the model on the new data stream without knowing the true class labels. When a significantly higher error is suspected, it investigates the true class labels of a selected number of examples in the most recent data stream to verify the suspected higher error.
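The two-step loop described in the abstract (estimate the error without labels, then verify on a small labelled sample) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' method: the abstract does not say how the error is estimated, so the sketch substitutes a simple proxy (the mean of one minus the model's top class probability). The names `estimated_error`, `verify_error`, `should_refresh`, `ask_label`, `tolerance`, and `sample_size` are all hypothetical.

```python
import math
import random


def estimated_error(probas):
    """Label-free proxy for the error rate (an assumption, not the paper's
    estimator): the mean of 1 - max class probability over the new chunk."""
    return sum(1.0 - max(p) for p in probas) / len(probas)


def verify_error(predictions, ask_label, sample_size, rng):
    """Request true labels for a random sample of examples and return the
    observed error rate together with its binomial standard error."""
    n = min(sample_size, len(predictions))
    idx = rng.sample(range(len(predictions)), n)
    wrong = sum(1 for i in idx if ask_label(i) != predictions[i])
    err = wrong / n
    return err, math.sqrt(err * (1.0 - err) / n)


def should_refresh(probas, predictions, ask_label, baseline_error,
                   tolerance=0.05, sample_size=50, seed=0):
    """Return (refresh?, suspected_error, verified_error_or_None)."""
    suspected = estimated_error(probas)
    # Step 1: no labels are requested unless the proxy looks clearly worse.
    if suspected <= baseline_error + tolerance:
        return False, suspected, None
    # Step 2: spend a small labelling budget to check the suspicion.
    err, se = verify_error(predictions, ask_label, sample_size,
                           random.Random(seed))
    # Confirm only if the lower end of a rough 95% interval exceeds baseline.
    return (err - 1.96 * se) > baseline_error, suspected, err
```

Here `ask_label(i)` stands for whatever process supplies the true label of example `i` on demand, such as a manual investigation. The point of the design is that labelling cost is paid only when the label-free estimate raises a suspicion, rather than on a fixed refresh schedule.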
Pages: 457-461
Page count: 5
Related papers
50 in total
  • [21] On mining time-changing data streams
    Department of Computer Science and Engineering, Southeast University, Nanjing 210018, China
    Unknown
    Chin J Electron, 2006, 2 (220-224):
  • [22] Mining Patterns From Data Streams: An Overview
    Borah, Anindita
    Nath, Bhabesh
    2017 INTERNATIONAL CONFERENCE ON I-SMAC (IOT IN SOCIAL, MOBILE, ANALYTICS AND CLOUD) (I-SMAC), 2017, : 371 - 376
  • [23] Processing and mining complex data streams Preface
    Stefanowski, Jerzy
    Cuzzocrea, Alfredo
    Slezak, Dominik
    INFORMATION SCIENCES, 2014, 285 : 63 - 65
  • [24] Mining Recent Frequent Itemsets in Data Streams
    Li, Kun
    Wang, Yong-yan
    Ellahi, Manzoor
    Wang, Hong-an
    FIFTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 4, PROCEEDINGS, 2008, : 353 - 358
  • [25] Suggested Techniques for Clustering and Mining of Data Streams
    Anuradha, G.
    Roy, Bidisha
    2014 INTERNATIONAL CONFERENCE ON CIRCUITS, SYSTEMS, COMMUNICATION AND INFORMATION TECHNOLOGY APPLICATIONS (CSCITA), 2014, : 265 - 270
  • [26] A general framework for mining massive data streams
    Domingos, P
    Hulten, G
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2003, 12 (04) : 945 - 949
  • [27] The CART decision tree for mining data streams
    Rutkowski, Leszek
    Jaworski, Maciej
    Pietruczuk, Lena
    Duda, Piotr
    INFORMATION SCIENCES, 2014, 266 : 1 - 15
  • [28] Mining Evolving Data Streams with Particle Filters
    Fok, Ricky
    An, Aijun
    Wang, Xiaogang
    COMPUTATIONAL INTELLIGENCE, 2017, 33 (02) : 147 - 180
  • [29] Resource-aware mining of data streams
    Gaber, MM
    Krishnaswamy, S
    Zaslavsky, A
    JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2005, 11 (08) : 1440 - 1453
  • [30] Effect of Data Repair on Mining Network Streams
    Loh, Ji Meng
    Dasu, Tamraparni
    12TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2012), 2012, : 226 - 233