Interpretable SAM-kNN Regressor for Incremental Learning on High-Dimensional Data Streams

Cited by: 0
Authors
Jakob, Jonathan [1 ]
Artelt, Andre [1 ,2 ]
Hasenjaeger, Martina [3 ]
Hammer, Barbara [1 ]
Affiliations
[1] Bielefeld Univ, Tech Fac, Bielefeld, Germany
[2] Univ Cyprus, Dept Comp Sci, Nicosia, Cyprus
[3] Honda Res Inst, Learning & Personalizat Grp, Offenbach, Germany
Funding
European Research Council
Keywords
CONCEPT DRIFT; MODEL
DOI
10.1080/08839514.2023.2198846
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
In many real-world scenarios, data arrive as a potentially infinite stream of samples subject to changes in the underlying data distribution, a phenomenon referred to as concept drift. A specific facet of concept drift is feature drift, where the relevance of a feature to the problem at hand changes over time. High dimensionality of the data poses an additional challenge to learning algorithms operating in such environments. Common scenarios of this nature arise, for example, in sensor-based maintenance of industrial machines or in entire networks, such as power grids or water distribution systems. However, since most existing methods for incremental learning focus on classification tasks, efficient online learning for regression remains underdeveloped. In this work, we introduce an extension of the SAM-kNN regressor that incorporates metric learning in order to improve prediction quality on data streams, gain insight into the relevance of individual input features and, based on these relevances, project the input data into a lower-dimensional space in order to reduce computational cost and improve suitability for high-dimensional data. We evaluate the proposed method on artificial data to demonstrate its applicability in various scenarios. In addition, we apply the method to the real-world problem of water distribution network monitoring. Specifically, we demonstrate that sensor faults in the water distribution network can be detected by monitoring the feature relevances computed by our algorithm.
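The core idea described in the abstract, a kNN regressor on a sliding window whose distance metric carries learnable per-feature relevances, can be sketched as follows. This is a minimal illustration only: the class name, the relevance-update heuristic, and all hyperparameters are assumptions, not the authors' SAM-kNN extension.

```python
import numpy as np


class RelevanceWeightedKNNRegressor:
    """Minimal sketch: kNN regression on a data stream with a diagonal
    relevance metric. The relevance-update rule below is an illustrative
    heuristic, not the metric-learning scheme from the paper."""

    def __init__(self, k=5, n_features=2, lr=0.01, window=200):
        self.k = k
        self.lr = lr
        self.window = window          # sliding memory of recent samples
        self.w = np.ones(n_features)  # per-feature relevances (diagonal metric)
        self.X, self.y = [], []

    def _neighbor_idx(self, x):
        X = np.asarray(self.X)
        # relevance-weighted squared Euclidean distance to every stored sample
        d = ((X - x) ** 2 * self.w).sum(axis=1)
        return np.argsort(d)[: self.k]

    def predict(self, x):
        if len(self.X) < self.k:
            return float(np.mean(self.y)) if self.y else 0.0
        idx = self._neighbor_idx(np.asarray(x, dtype=float))
        return float(np.mean(np.asarray(self.y)[idx]))

    def partial_fit(self, x, y):
        """Interleaved test-then-train step, as is common on streams."""
        x = np.asarray(x, dtype=float)
        if len(self.X) >= self.k:
            idx = self._neighbor_idx(x)
            err = self.predict(x) - y
            # Heuristic update: when the error is large, down-weight
            # features along which the chosen neighbors are far from x,
            # then renormalize so the mean relevance stays at 1.
            nbrs = np.asarray(self.X)[idx]
            grad = (err ** 2) * ((nbrs - x) ** 2).mean(axis=0)
            self.w = np.clip(self.w - self.lr * grad, 1e-3, None)
            self.w *= len(self.w) / self.w.sum()
        self.X.append(x)
        self.y.append(float(y))
        if len(self.X) > self.window:  # forget the oldest sample
            self.X.pop(0)
            self.y.pop(0)
```

Tracking the relevance vector `w` over time is what makes such a model interpretable: a sudden change in one sensor's relevance is exactly the kind of signal the abstract uses to detect sensor faults in a water distribution network.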
Pages: 30
Related Papers (50 in total)
  • [1] SAM-kNN Regressor for Online Learning in Water Distribution Networks
    Jakob, Jonathan
    Artelt, Andre
    Hasenjaeger, Martina
    Hammer, Barbara
    [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2022, PT III, 2022, 13531 : 752 - 762
  • [2] Balanced SAM-kNN: Online Learning with Heterogeneous Drift and Imbalanced Data
    Vaquet, Valerie
    Hammer, Barbara
    [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2020, PT II, 2020, 12397 : 850 - 862
  • [3] High-dimensional kNN joins with incremental updates
    Yu, Cui
    Zhang, Rui
    Huang, Yaochun
    Xiong, Hui
    [J]. GEOINFORMATICA, 2010, 14 (01) : 55 - 82
  • [5] Interpretable Approximation of High-Dimensional Data
    Potts, Daniel
    Schmischke, Michael
    [J]. SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE, 2021, 3 (04): : 1301 - 1323
  • [6] Learning High-Dimensional Evolving Data Streams With Limited Labels
    Din, Salah Ud
    Kumar, Jay
    Shao, Junming
    Mawuli, Cobbinah Bernard
    Ndiaye, Waldiodio David
    [J]. IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52 (11) : 11373 - 11384
  • [7] Bayesian evolutionary hypernetworks for interpretable learning from high-dimensional data
    Kim, Soo-Jin
    Ha, Jung-Woo
    Kim, Heebal
    Zhang, Byoung-Tak
    [J]. APPLIED SOFT COMPUTING, 2019, 81
  • [8] Boosting for Vote Learning in High-dimensional kNN Classification
    Tomasev, Nenad
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOP (ICDMW), 2014, : 676 - 683
  • [9] Scalable and Interpretable Data Representation for High-Dimensional, Complex Data
    Kim, Been
    Patel, Kayur
    Rostamizadeh, Afshin
    Shah, Julie
    [J]. PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2015, : 1763 - 1769
  • [10] Learning high-dimensional data
    Verleysen, M
    [J]. LIMITATIONS AND FUTURE TRENDS IN NEURAL COMPUTATION, 2003, 186 : 141 - 162