Interpretable SAM-kNN Regressor for Incremental Learning on High-Dimensional Data Streams

Cited by: 0
Authors
Jakob, Jonathan [1]
Artelt, Andre [1, 2]
Hasenjaeger, Martina [3]
Hammer, Barbara [1]
Affiliations
[1] Bielefeld Univ, Tech Fac, Bielefeld, Germany
[2] Univ Cyprus, Dept Comp Sci, Nicosia, Cyprus
[3] Honda Res Inst, Learning & Personalizat Grp, Offenbach, Germany
Funding
European Research Council
Keywords
CONCEPT DRIFT; MODEL;
DOI
10.1080/08839514.2023.2198846
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In many real-world scenarios, data arrive as a potentially infinite stream of samples whose underlying distribution changes over time, a phenomenon often referred to as concept drift. A specific facet of concept drift is feature drift, where the relevance of a feature to the problem at hand changes over time. High dimensionality of the data poses an additional challenge to learning algorithms operating in such environments. Such scenarios arise, for example, in sensor-based maintenance of industrial machines or in entire networks, such as power grids or water distribution systems. However, since most existing methods for incremental learning focus on classification tasks, efficient online learning for regression is still an underdeveloped area. In this work, we introduce an extension to the SAM-kNN Regressor that incorporates metric learning in order to (i) improve prediction quality on data streams, (ii) gain insight into the relevance of the individual input features, and (iii) based on these relevances, transform the input data into a lower-dimensional space, which reduces computational cost and makes the method better suited to high-dimensional data. We evaluate the proposed method on artificial data to demonstrate its applicability in various scenarios. In addition, we apply the method to the real-world problem of water distribution network monitoring. Specifically, we demonstrate that sensor faults in the water distribution network can be detected by monitoring the feature relevances computed by our algorithm.
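The idea summarized in the abstract (a kNN regressor on a data stream whose distance is weighted by per-feature relevances, with low-relevance features dropped) can be illustrated with a minimal sketch. This is not the authors' SAM-kNN implementation: the class name `RelevanceWeightedKNN`, its parameters, the single sliding-window memory, and the use of absolute Pearson correlation as a stand-in for the learned metric are all assumptions made for illustration only.

```python
import numpy as np
from collections import deque


class RelevanceWeightedKNN:
    """Illustrative sliding-window kNN regressor with per-feature relevances.

    Hypothetical sketch: relevances are estimated by absolute Pearson
    correlation on the window, a simple stand-in for metric learning.
    """

    def __init__(self, k=5, window=200, n_keep=None):
        self.k = k
        self.n_keep = n_keep            # features kept in the distance (None = all)
        self.X = deque(maxlen=window)   # recent inputs
        self.y = deque(maxlen=window)   # matching targets
        self.relevance = None           # non-negative weights summing to 1

    def _update_relevance(self):
        X, y = np.asarray(self.X), np.asarray(self.y)
        Xc, yc = X - X.mean(axis=0), y - y.mean()
        denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12
        r = np.abs(Xc.T @ yc) / denom
        self.relevance = r / (r.sum() + 1e-12)

    def learn_one(self, x, y):
        self.X.append(np.asarray(x, dtype=float))
        self.y.append(float(y))
        if len(self.X) >= 2:
            self._update_relevance()

    def predict_one(self, x):
        if self.relevance is None:
            return 0.0                  # nothing learned yet
        X, y = np.asarray(self.X), np.asarray(self.y)
        w = self.relevance.copy()
        if self.n_keep is not None:
            # Zero out all but the n_keep most relevant features.
            w[np.argsort(w)[:-self.n_keep]] = 0.0
        d = ((X - np.asarray(x, dtype=float)) ** 2 * w).sum(axis=1)
        nearest = np.argsort(d)[: self.k]
        return float(y[nearest].mean())


# Synthetic stream: 10 features, only features 0 and 1 drive the target.
rng = np.random.default_rng(0)
model = RelevanceWeightedKNN(k=5, window=200, n_keep=2)
for _ in range(300):
    x = rng.normal(size=10)
    target = 3.0 * x[0] - 2.0 * x[1] + 0.1 * rng.normal()
    model.learn_one(x, target)

top_two = sorted(int(i) for i in np.argsort(model.relevance)[-2:])
prediction = model.predict_one(np.zeros(10))
```

In this sketch, `model.relevance` can be logged after every update; a sudden shift in that vector is the kind of signal the abstract describes using for sensor fault detection, although the thresholding rule would have to be chosen separately.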
Pages: 30