Interpretable SAM-kNN Regressor for Incremental Learning on High-Dimensional Data Streams

Cited by: 0
Authors
Jakob, Jonathan [1]
Artelt, Andre [1,2]
Hasenjaeger, Martina [3]
Hammer, Barbara [1]
Affiliations
[1] Bielefeld Univ, Tech Fac, Bielefeld, Germany
[2] Univ Cyprus, Dept Comp Sci, Nicosia, Cyprus
[3] Honda Res Inst, Learning & Personalizat Grp, Offenbach, Germany
Funding
European Research Council;
Keywords
CONCEPT DRIFT; MODEL;
DOI
10.1080/08839514.2023.2198846
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
In many real-world scenarios, data arrive as a potentially infinite stream of samples whose underlying distribution changes over time, a phenomenon often referred to as concept drift. A specific facet of concept drift is feature drift, where the relevance of a feature to the problem at hand changes over time. High dimensionality of the data poses an additional challenge to learning algorithms operating in such environments. Such scenarios arise, for example, in sensor-based maintenance of industrial machines and in entire networks such as power grids or water distribution systems. However, because most existing methods for incremental learning focus on classification tasks, efficient online learning for regression is still an underdeveloped area. In this work, we introduce an extension to the SAM-kNN Regressor that incorporates metric learning in order to improve prediction quality on data streams, gain insight into the relevance of the individual input features, and, based on these relevances, transform the input data into a lower-dimensional space, which reduces computational cost and makes the method better suited to high-dimensional data. We evaluate the proposed method on artificial data to demonstrate its applicability in various scenarios. In addition, we apply it to the real-world problem of water distribution network monitoring. Specifically, we demonstrate that sensor faults in the water distribution network can be detected by monitoring the feature relevances computed by our algorithm.
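The ideas named in the abstract (kNN regression on a stream, a relevance-weighted distance, and dropping low-relevance features) can be illustrated with a minimal sketch. This is not the authors' SAM-kNN Regressor: the class `WindowedWeightedKNN`, the helper `most_relevant`, and all parameter names are hypothetical, the relevances are supplied by hand rather than learned through metric learning, and a single fixed-size sliding window stands in for SAM-kNN's memory structure.

```python
from collections import deque
import heapq


class WindowedWeightedKNN:
    """Hypothetical sketch: kNN regression over a sliding window with a
    diagonal, relevance-weighted squared Euclidean distance."""

    def __init__(self, relevances, k=3, window_size=200):
        self.relevances = list(relevances)       # one non-negative weight per feature (assumed given)
        self.k = k
        self.window = deque(maxlen=window_size)  # oldest samples are evicted automatically

    def _distance(self, a, b):
        # Features with relevance 0 do not influence the distance at all.
        return sum(w * (ai - bi) ** 2 for w, ai, bi in zip(self.relevances, a, b))

    def predict(self, x):
        if not self.window:
            return 0.0  # arbitrary default before any sample has been seen
        nearest = heapq.nsmallest(
            self.k, self.window, key=lambda sample: self._distance(x, sample[0])
        )
        return sum(y for _, y in nearest) / len(nearest)

    def learn_one(self, x, y):
        # Incremental update: store the newest labelled sample.
        self.window.append((tuple(x), float(y)))


def most_relevant(relevances, m):
    """Return the indices of the m features with the highest relevance."""
    order = sorted(range(len(relevances)), key=lambda i: relevances[i], reverse=True)
    return sorted(order[:m])


# Toy stream: the target depends only on feature 0; feature 1 is a distractor.
model = WindowedWeightedKNN(relevances=[1.0, 0.0], k=1, window_size=3)
for x, y in [((0.0, 9.0), 0.0), ((1.0, -9.0), 1.0), ((2.0, 5.0), 2.0)]:
    model.learn_one(x, y)
prediction = model.predict((1.1, 9.0))
```

With the distractor feature weighted to zero, the query is matched on feature 0 alone. Tracking how such weights change over time is the kind of signal the abstract describes for detecting sensor faults, although how the weights are learned and monitored is specified only in the paper itself.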
Pages: 30
Related papers
50 in total
  • [11] Efficient kNN Join over Dynamic High-Dimensional Data
    Ukey, Nimish
    Yang, Zhengyi
    Zhang, Guangjian
    Liu, Boge
    Li, Binghao
    Zhang, Wenjie
    [J]. DATABASES THEORY AND APPLICATIONS (ADC 2022), 2022, 13459 : 63 - 75
  • [12] Adaptive quantization of the high-dimensional data for efficient KNN processing
    Cui, B
    Hu, J
    Shen, HT
    Yu, C
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, 2004, 2973 : 302 - 313
  • [13] kNN Join for Dynamic High-Dimensional Data: A Parallel Approach
    Ukey, Nimish
    Yang, Zhengyi
    Yang, Wenke
    Li, Binghao
    Li, Runze
    [J]. DATABASES THEORY AND APPLICATIONS, ADC 2023, 2024, 14386 : 3 - 16
  • [14] Linearization approach for efficient KNN search of high-dimensional data
    Al Aghbari, Z
    Makinouchi, A
    [J]. ADVANCES IN WEB-AGE INFORMATION MANAGEMENT: PROCEEDINGS, 2004, 3129 : 229 - 238
  • [15] INTERPRETABLE MACHINE LEARNING OF HIGH-DIMENSIONAL AGING HEALTH TRAJECTORIES
    Farrell, Spencer
    Mitnitski, Arnold
    Rockwood, Kenneth
    Rutenberg, Andrew
    [J]. INNOVATION IN AGING, 2021, 5 : 672 - 672
  • [16] Biologically inspired incremental learning for high-dimensional spaces
    Gepperth, Alexander
    Hecht, Thomas
    Lefort, Mathieu
    Koerner, Ursula
    [J]. 5TH INTERNATIONAL CONFERENCE ON DEVELOPMENT AND LEARNING AND ON EPIGENETIC ROBOTICS (ICDL-EPIROB), 2015, : 269 - 275
  • [17] Interpretable machine learning for high-dimensional trajectories of aging health
    Farrell, Spencer
    Mitnitski, Arnold
    Rockwood, Kenneth
    Rutenberg, Andrew
    [J]. PLOS COMPUTATIONAL BIOLOGY, 2022, 18 (01)
  • [18] Producing accurate interpretable clusters from high-dimensional data
    Greene, D
    Cunningham, P
    [J]. KNOWLEDGE DISCOVERY IN DATABASES: PKDD 2005, 2005, 3721 : 486 - 494
  • [19] Online Pattern Mining for High-Dimensional Data Streams
    Yamamoto, Yoshitaka
    Iwanuma, Koji
    [J]. PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIG DATA, 2015, : 2880 - 2882
  • [20] Detecting Projected Outliers in High-Dimensional Data Streams
    Zhang, Ji
    Gao, Qigang
    Wang, Hai
    Liu, Qing
    Xu, Kai
    [J]. DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS, 2009, 5690 : 629 - +