Online Clustering for Novelty Detection and Concept Drift in Data Streams

Cited by: 6
Authors
Garcia, Kemilly Dearo [1, 2]
Poel, Mannes [1 ]
Kok, Joost N. [1 ]
de Carvalho, Andre C. P. L. F. [2 ]
Affiliations
[1] Univ Twente, Enschede, Netherlands
[2] Univ Sao Paulo, ICMC, Sao Paulo, Brazil
Source
Keywords
Data stream; Concept drift; Novelty detection; Online learning; Classification
DOI
10.1007/978-3-030-30244-3_37
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data streams are large volumes of data that arrive continuously and whose probability distribution may change over time. Depending on how the distribution changes, different phenomena can occur: new classes can appear, or concept drift can occur in existing classes. Machine Learning algorithms have often been used to model such data. New classes are patterns that were not seen during the training of the current classification model but appear after some time. Concept drift occurs when the concepts associated with a dataset change as new data arrive. This paper proposes a new algorithm based on kNN that uses micro-clusters as prototypes and incrementally updates the micro-clusters or creates new micro-clusters when novelties are detected. In the online phase, each instance close to a micro-cluster is considered an extension of that micro-cluster and is used to adapt the model to concept drift. The proposed algorithm is experimentally compared with a state-of-the-art classifier from the data stream literature and one baseline. According to the experimental results, the proposed algorithm increases the predictive performance over time by incrementally learning changes in the data distribution.
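The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea it describes (nearest-prototype classification over micro-clusters, incremental updates for drift, and new micro-clusters for novelties), not the authors' method. The names MicroCluster and MicroClusterStream, the parameters max_dist and min_novel, the fixed distance threshold, and the rule that turns a full buffer of unexplained instances into a new micro-cluster are all assumptions made for illustration. A realistic version would also check that the buffered instances are cohesive before declaring a novelty.

```python
import math
from dataclasses import dataclass


@dataclass
class MicroCluster:
    """Summary of a group of instances: class label, count, linear sum."""
    label: str
    n: int
    linear_sum: list

    @classmethod
    def from_points(cls, label, points):
        # Column-wise sums of the points form the linear-sum vector.
        sums = [sum(col) for col in zip(*points)]
        return cls(label=label, n=len(points), linear_sum=sums)

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def absorb(self, x):
        # Incremental update: the centroid shifts toward the new instance.
        self.n += 1
        self.linear_sum = [s + v for s, v in zip(self.linear_sum, x)]


class MicroClusterStream:
    """Hypothetical online phase: 1-NN over micro-cluster centroids."""

    def __init__(self, clusters, max_dist=1.0, min_novel=3):
        self.clusters = list(clusters)
        self.max_dist = max_dist      # assumed fixed distance threshold
        self.min_novel = min_novel    # assumed buffer size that triggers a novelty
        self.unknown = []             # instances not explained by any micro-cluster
        self.novel_count = 0

    def process(self, x):
        """Return the predicted label, or None while x is held as unknown."""
        nearest = min(self.clusters, key=lambda c: math.dist(c.centroid(), x))
        if math.dist(nearest.centroid(), x) <= self.max_dist:
            nearest.absorb(x)         # extend the micro-cluster to follow drift
            return nearest.label
        self.unknown.append(list(x))
        if len(self.unknown) >= self.min_novel:
            # Simplification: no cohesion check on the buffered instances.
            self.novel_count += 1
            label = f"novelty-{self.novel_count}"
            self.clusters.append(MicroCluster.from_points(label, self.unknown))
            self.unknown = []
            return label
        return None
```

In this sketch, an instance within max_dist of its nearest centroid is labeled and absorbed, so the prototype moves with the data; instances far from every prototype accumulate in a buffer until there are enough of them to start a new micro-cluster with a placeholder label.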
Pages: 448-459
Page count: 12