Online Active Learning in Data Stream Regression Using Uncertainty Sampling Based on Evolving Generalized Fuzzy Models

Cited by: 78
Authors
Lughofer, Edwin [1]
Pratama, Mahardhika [2]
Affiliations
[1] Johannes Kepler Univ Linz, Dept Knowledge Based Math Syst, A-4040 Linz, Austria
[2] La Trobe Univ, Sch Engn & Math Sci, Bundoora, Vic 3086, Australia
Keywords
Active learning latency buffer (ALLB); data stream regression; evolving generalized Takagi-Sugeno (TS) fuzzy systems; extrapolation degree; nonlinearity degree; online active learning; single-pass uncertainty-based sampling; uncertainty in model outputs and parameters; VISUAL INSPECTION; SYSTEMS; IDENTIFICATION; NETWORK; DESIGN;
DOI
10.1109/TFUZZ.2017.2654504
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
In this paper, we propose three criteria for efficient sample selection in data stream regression problems within an online active learning context. The selection becomes important whenever the target values, which guide the update of the regressors as well as the implicit model structures, are costly or time-consuming to measure, and whenever very fast model updates are required to cope with the real-time demands of stream mining. Reducing the number of selected samples as much as possible while keeping the predictive accuracy of the models at a high level is thus a central challenge. Ideally, this should be achieved in an unsupervised and single-pass manner. Our selection criteria rely on three aspects: 1) the extrapolation degree combined with the model's nonlinearity degree, which is measured in terms of a new specific homogeneity criterion among adjacent local approximators; 2) the uncertainty in model outputs, which can be measured in terms of confidence intervals using so-called adaptive local error bars (we integrate a weighted localization of an incremental noise level estimator and propose formulas for online merging of local error bars); 3) the uncertainty in model parameters, which is estimated by the so-called A-optimality criterion, which relies on the Fisher information matrix. The selection criteria are developed in combination with evolving generalized Takagi-Sugeno (TS) fuzzy models (containing rules in arbitrarily rotated position), as previous publications have shown that these outperform conventional evolving TS models (containing axis-parallel rules). The results based on three high-dimensional real-world streaming problems show that a model update based on only 10%-20% selected samples can still achieve accumulated model errors over time similar to those of a full model update on all samples. This can be achieved with negligible sensitivity to the size of the active learning latency buffer. Random sampling with the same percentages of samples selected, however, achieved much higher error rates. Hence, the intelligence in our sample selection concept leads to an economical balance between model accuracy and measurement as well as computational costs for model updates.
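The third criterion in the abstract (parameter uncertainty via A-optimality) can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not given in this record. It assumes a single linear model with Gaussian noise, for which the Fisher information matrix is proportional to X^T X, so that A-optimality amounts to the trace of its inverse. The function names, the relative-gain threshold, and the `alpha` initialization are hypothetical choices made for illustration only.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def mat_vec(P: Matrix, x: Vector) -> Vector:
    """Plain matrix-vector product."""
    return [sum(p * v for p, v in zip(row, x)) for row in P]


def a_optimality_gain(P: Matrix, x: Vector) -> float:
    """Decrease of trace(P) if sample x were used for an update.

    P is the (symmetric) inverse of the regularized information matrix,
    up to the noise variance. By the Sherman-Morrison formula the trace
    drops by ||P x||^2 / (1 + x^T P x).
    """
    Px = mat_vec(P, x)
    denom = 1.0 + sum(xi * pxi for xi, pxi in zip(x, Px))
    return sum(v * v for v in Px) / denom


def update_inverse(P: Matrix, x: Vector) -> Matrix:
    """Rank-one (Sherman-Morrison) update of P after including x."""
    Px = mat_vec(P, x)
    denom = 1.0 + sum(xi * pxi for xi, pxi in zip(x, Px))
    n = len(x)
    return [[P[i][j] - Px[i] * Px[j] / denom for j in range(n)] for i in range(n)]


def select_stream(stream: List[Vector], threshold: float, alpha: float = 1000.0) -> List[int]:
    """Single-pass selection: request the target of a sample only if it
    would reduce trace(P) by more than `threshold` (relative reduction).

    Assumption: P starts as alpha * identity, a common regularized
    initialization in recursive least squares.
    """
    n = len(stream[0])
    P = [[alpha if i == j else 0.0 for j in range(n)] for i in range(n)]
    selected = []
    for k, x in enumerate(stream):
        trace = sum(P[i][i] for i in range(n))
        if a_optimality_gain(P, x) / trace > threshold:
            selected.append(k)
            P = update_inverse(P, x)  # only selected samples update the model
    return selected
```

In this sketch, a sample lying in a direction that has already been observed yields a small trace reduction and is skipped, whereas a sample in a new direction is selected. The selection decision uses only the inputs and never the target value, matching the unsupervised, single-pass setting described in the abstract.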
Pages: 292-309
Page count: 18