Concept Neurons - Handling Drift Issues for Real-Time Industrial Data Mining

Cited by: 12
Authors
Moreira-Matias, Luis [1]
Gama, Joao [2,4]
Mendes-Moreira, Joao [2,3]
Affiliations
[1] NEC Labs Europe, Kurfursten Anlage 36, D-69115 Heidelberg, Germany
[2] Univ Porto, Fac Econ, P-4200465 Oporto, Portugal
[3] LIAAD INESC TEC, P-4200465 Oporto, Portugal
[4] Univ Porto, DEI FEUP, P-4200465 Oporto, Portugal
Keywords
Supervised learning; Online learning; Concept drift; Perceptron; Stochastic gradient descent; Regression; Residuals; Transportation
DOI
10.1007/978-3-319-46131-1_18
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Learning from data streams is a challenge faced by data science professionals across multiple industries. Many of them struggle to apply traditional Machine Learning algorithms to these problems, a choice driven largely by the wide availability of such algorithms in ready-to-use software libraries for big data technologies (e.g. SparkML). Nevertheless, most of these algorithms cannot cope with the key characteristics of this type of data, such as a high arrival rate and/or non-stationary distributions. In this paper, we introduce a generic yet simple framework, named Concept Neurons, to fill this gap. It leverages a combination of continuous inspection schemas and residual-based updates over the model parameters and/or the model output. Such a framework can strengthen the resistance of most induction learning algorithms to concept drift. Two distinct yet closely related flavors are introduced to handle different drift types. Experimental results from successful applications in different domains across the transportation industry are presented to uncover the hidden potential of this methodology.
Pages: 96-111
Page count: 16