Concept Neurons - Handling Drift Issues for Real-Time Industrial Data Mining

Cited by: 12
Authors
Moreira-Matias, Luis [1]
Gama, Joao [2, 4]
Mendes-Moreira, Joao [2, 3]
Affiliations
[1] NEC Labs Europe, Kurfursten Anlage 36, D-69115 Heidelberg, Germany
[2] Univ Porto, Fac Econ, P-4200465 Oporto, Portugal
[3] LIAAD INESC TEC, P-4200465 Oporto, Portugal
[4] Univ Porto, DEI FEUP, P-4200465 Oporto, Portugal
Keywords
Supervised learning; Online learning; Concept drift; Perceptron; Stochastic gradient descent; Regression; Residuals; Transportation
DOI
10.1007/978-3-319-46131-1_18
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Learning from data streams is a challenge faced by data science professionals across multiple industries. Most of them struggle when applying traditional Machine Learning algorithms to these problems. They turn to these algorithms because of their high availability in ready-to-use software libraries for big data technologies (e.g. SparkML). Nevertheless, most of these algorithms cannot cope with the key characteristics of this type of data, such as a high arrival rate and/or non-stationary distributions. In this paper, we introduce Concept Neurons, a generic yet simple framework to fill this gap. It leverages a combination of continuous inspection schemas and residual-based updates over the model parameters and/or the model output. Such a framework can strengthen the resistance of most inductive learning algorithms to concept drift. Two distinct yet closely related flavors are introduced to handle different drift types. Experimental results from successful applications in different domains across the transportation industry are presented to uncover the hidden potential of this methodology.
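The abstract mentions residual-based updates over the model output. As a rough illustration of that general idea only, the sketch below keeps a fixed base model and trains a single linear neuron by stochastic gradient descent to predict the base model's residual, adding that prediction back as a correction. It is not the authors' algorithm: the names `ResidualCorrector`, `run_stream`, `make_drifting_stream` and `base_model`, the learning rate, and the synthetic drifting stream are all assumptions made for this example.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], float]


class ResidualCorrector:
    """Hypothetical single linear neuron that learns the base model's residual online."""

    def __init__(self, n_features: int, learning_rate: float = 0.5) -> None:
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate  # assumed value, not taken from the paper

    def correction(self, x: Sequence[float]) -> float:
        # Predicted residual for input x.
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def update(self, x: Sequence[float], residual: float) -> None:
        # One SGD step on the squared error between observed and predicted residual.
        error = residual - self.correction(x)
        self.bias += self.learning_rate * error
        self.weights = [
            w + self.learning_rate * error * xi for w, xi in zip(self.weights, x)
        ]


def run_stream(
    base_model: Callable[[Sequence[float]], float],
    stream: List[Sample],
    n_features: int,
    learning_rate: float = 0.5,
) -> List[float]:
    """Predict each sample first, then learn from it; return absolute errors."""
    corrector = ResidualCorrector(n_features, learning_rate)
    abs_errors = []
    for x, y in stream:
        base_pred = base_model(x)
        y_hat = base_pred + corrector.correction(x)  # corrected model output
        abs_errors.append(abs(y - y_hat))
        corrector.update(x, y - base_pred)  # residual of the untouched base model
    return abs_errors


def base_model(x: Sequence[float]) -> float:
    # Stand-in for a model trained offline, before any drift occurs.
    return 2.0 * x[0]


def make_drifting_stream(n_steps: int, drift_at: int, shift: float = 3.0) -> List[Sample]:
    # Synthetic stream: y = 2x until drift_at, then y = 2x + shift (abrupt drift).
    stream = []
    for i in range(n_steps):
        x = [(i % 10) / 10.0]
        y = 2.0 * x[0] + (shift if i >= drift_at else 0.0)
        stream.append((x, y))
    return stream


if __name__ == "__main__":
    errors = run_stream(base_model, make_drifting_stream(600, 200), n_features=1)
    print("error at drift point:", errors[200])
    print("mean error over last 50 steps:", sum(errors[-50:]) / 50)
```

In this toy setting the base model is never retrained; only the small corrector adapts, which is one way to read "updates over the model output". Updating the base model's own parameters from the residuals would be the other variant the abstract alludes to.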
Pages: 96 - 111
Number of pages: 16
Related papers
50 in total
  • [31] RAPID CAMAC DATA HANDLING IN FORTRAN REAL-TIME PROGRAMS
    FEENSTRA, R
    JOHNSON, RR
    WINTER, C
    NUCLEAR INSTRUMENTS & METHODS, 1979, 160 (03): : 511 - 518
  • [32] Outlier mining in real-time measurement data of sensor based on data mining technique
    Lei, Lin
    Wang, Houjun
    ISTM/2007: 7TH INTERNATIONAL SYMPOSIUM ON TEST AND MEASUREMENT, VOLS 1-7, CONFERENCE PROCEEDINGS, 2007, : 3437 - 3440
  • [33] Efficient Handling of Concept Drift and Concept Evolution over Stream Data
    Haque, Ahsanul
    Khan, Latifur
    Baron, Michael
    Thuraisingham, Bhavani
    Aggarwal, Charu
    2016 32ND IEEE INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2016, : 481 - 492
  • [34] Research on real-time data communication for industrial Ethernet
    Xi, B
    Fang, YJ
    ICEMI 2005: Conference Proceedings of the Seventh International Conference on Electronic Measurement & Instruments, Vol 4, 2005, : 133 - 137
  • [35] Research on real-time data communication for industrial Ethernet
    Xi Bo
    Fang Yan-jun
    Proceedings of 2005 Chinese Control and Decision Conference, Vols 1 and 2, 2005, : 1377 - 1380
  • [36] Adaptive Reservoir Neural Gas: An Effective Clustering Algorithm for Addressing Concept Drift in Real-Time Data Streams
    Demertzis, Konstantinos
    Iliadis, Lazaros
    Papaleonidas, Antonios
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT VI, 2023, 14259 : 152 - 166
  • [37] Research on real-time network data mining technology for big data
    Jing Hu
    Xianbin Xu
    EURASIP Journal on Wireless Communications and Networking, 2019
  • [38] Strict real-time for deformable materials: On handling deformable objects with industrial robots
    Hinze, Christoph
    Wnuk, Markus
    Lechler, Armin
    Verl, Alexander
    ATP MAGAZINE, 2019, (11-12): : 112 - 119
  • [39] Real Time Data Stream Classification and Adapting To Various Concept Drift Scenarios
    Dongre, Priyanka B.
    Malik, Latesh G.
    SOUVENIR OF THE 2014 IEEE INTERNATIONAL ADVANCE COMPUTING CONFERENCE (IACC), 2014, : 533 - 537
  • [40] A real-time filtering method for the drift data of a laser Doppler velocimeter
    Wang, Qi
    Gao, Chunfeng
    Zhou, Jian
    Nie, Xiaoming
    Long, Xingwu
    JOURNAL OF MODERN OPTICS, 2019, 66 (04) : 413 - 418