Incremental k-Nearest Neighbors Using Reservoir Sampling for Data Streams

Cited by: 1
Authors
Bahri, Maroua [1]
Bifet, Albert [1, 2]
Affiliations
[1] IP Paris, LTCI, Telecom Paris, Paris, France
[2] Univ Waikato, Hamilton, New Zealand
Source
DISCOVERY SCIENCE (DS 2021) | 2021 / Vol. 12986
Keywords
Data stream classification; K-nearest neighbors; Reservoir sampling; Sliding window;
DOI
10.1007/978-3-030-88942-5_10
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The online and potentially infinite nature of data streams makes it impossible to store the flow in its entirety, which restricts storage to a part of the stream and/or synopsis information derived from it. To process these evolving data, we need efficient and accurate methodologies and systems, such as window models (e.g., sliding windows) and summarization techniques (e.g., sampling, sketching, dimensionality reduction). In this paper, we propose RW-kNN, a k-Nearest Neighbors (kNN) algorithm that stores information about past instances in a practical way: it uses biased reservoir sampling to sample the input instances, along with a sliding window to maintain the most recent instances from the stream. We evaluate our proposal on a diverse set of synthetic and real datasets and compare it against state-of-the-art algorithms in a traditional test-then-train evaluation. Results show that RW-kNN achieves high predictive performance on both real and synthetic datasets while using a feasible amount of resources.
Pages: 122-137
Page count: 16