Data Augmentation On-the-fly and Active Learning in Data Stream Classification

Cited by: 2
Authors
Malialis, Kleanthis [1, 2]
Papatheodoulou, Dimitris [1]
Filippou, Stylianos [1]
Panayiotou, Christos G. [1, 2]
Polycarpou, Marios M. [1, 2]
Affiliations
[1] Univ Cyprus, KIOS Res & Innovat Ctr Excellence, Nicosia, Cyprus
[2] Univ Cyprus, Dept Elect & Comp Engn, Nicosia, Cyprus
Funding
European Research Council;
Keywords
incremental learning; active learning; data streams; class imbalance; neural networks;
DOI
10.1109/SSCI51031.2022.10022133
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
There is an emerging need for predictive models to be trained on-the-fly, since in numerous machine learning applications data arrive in an online fashion. A critical challenge is the limited availability of ground truth information (e.g., labels in classification tasks) as new data are observed one-by-one online, while another significant challenge is class imbalance. This work introduces the novel Augmented Queues method, which addresses this dual problem by synergistically combining online active learning, data augmentation, and a multi-queue memory that maintains separate and balanced queues for each class. We perform an extensive experimental study using image and time-series augmentations, in which we examine the roles of the active learning budget, memory size, imbalance level, and neural network type. We demonstrate two major advantages of Augmented Queues. First, it does not reserve additional memory space, as the generation of synthetic data occurs only at training time. Second, learning models have access to more labelled data without the need to increase the active learning budget and/or the original memory size. Learning on-the-fly poses major challenges which typically hinder the deployment of learning models. Augmented Queues significantly improves performance in terms of learning quality and speed. Our code is made publicly available.
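The following is a minimal Python sketch of the idea described in the abstract; it is not the authors' released code. The class name MultiQueueMemory, the Gaussian jitter augmentation, and the should_query rule with its threshold are hypothetical choices made for illustration, since the abstract does not specify them. The sketch keeps one bounded queue per class, asks for a label only when the model is uncertain and the budget allows, and creates synthetic examples only while assembling a training set, so nothing synthetic is stored.

```python
import random
from collections import deque


class MultiQueueMemory:
    """Hypothetical multi-queue memory: one bounded FIFO queue per class."""

    def __init__(self, classes, queue_size):
        # Equal capacity per class keeps the stored data balanced
        # once every queue has filled up.
        self.queues = {c: deque(maxlen=queue_size) for c in classes}

    def add(self, x, y):
        # The oldest example of class y is dropped when its queue is full.
        self.queues[y].append(x)

    def stored(self):
        return sum(len(q) for q in self.queues.values())

    def training_set(self, augment, n_aug):
        """Return stored examples plus n_aug synthetic copies of each.

        The synthetic copies exist only in the returned list; they are
        never written back to the queues.
        """
        batch = []
        for y, q in self.queues.items():
            for x in q:
                batch.append((x, y))
                batch.extend((augment(x), y) for _ in range(n_aug))
        return batch


def jitter(x, rng, scale=0.05):
    """Assumed augmentation: add small Gaussian noise to each feature."""
    return [v + rng.gauss(0.0, scale) for v in x]


def should_query(confidence, spent, seen, budget, threshold=0.8):
    """Assumed active learning rule: request a label when the model is
    uncertain and the fraction of labels spent so far is under budget."""
    return confidence < threshold and spent < budget * seen


rng = random.Random(0)
memory = MultiQueueMemory(classes=[0, 1], queue_size=3)
for i in range(10):
    memory.add([float(i), 0.0], 0)  # majority class: only the last 3 are kept
memory.add([0.0, 1.0], 1)           # minority class: a single example so far
batch = memory.training_set(lambda x: jitter(x, rng), n_aug=2)
```

In this sketch the memory holds four real examples while the training set handed to the model contains twelve, which mirrors the abstract's point that augmentation at training time gives the model more labelled data without enlarging the memory or the labelling budget.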
Pages: 1408-1414
Number of pages: 7
Related papers
50 in total
  • [1] On-the-Fly Aligned Data Augmentation for Sequence-to-Sequence ASR
    Lam, Tsz Kin
    Ohta, Mayumi
    Schamoni, Shigehiko
    Riezler, Stefan
    [J]. INTERSPEECH 2021, 2021 : 1299 - 1303
  • [2] ON-THE-FLY DATA AUGMENTATION FOR TEXT-TO-SPEECH STYLE TRANSFER
    Chung, Raymond
    Mak, Brian
    [J]. 2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021 : 634 - 641
  • [3] GuidedMix: An on-the-fly data augmentation approach for robust speaker recognition system
    Xiao, Runqiu
    Li, Zhuo
    Miao, Xiaoxiao
    Wang, Wenchao
    Zhang, Pengyuan
    [J]. ELECTRONICS LETTERS, 2022, 58 (02) : 82 - 85
  • [4] Clustering-based Active Learning Classification towards Data Stream
    Yin, Chunyong
    Chen, Shuangshuang
    Yin, Zhichao
    [J]. ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2023, 14 (02)
  • [5] On-the-fly Data Transformation in Action
    Mun, Ju Hyoung
    Karatsenidis, Konstantinos
    Papon, Tarikul Islam
    Roozkhosh, Shahin
    Hoornaert, Denis
    Drepper, Ulrich
    Sanaullah, Ahmed
    Mancuso, Renato
    Athanassoulis, Manos
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2023, 16 (12) : 3950 - 3953
  • [6] IMPROVING SEQUENCE-TO-SEQUENCE SPEECH RECOGNITION TRAINING WITH ON-THE-FLY DATA AUGMENTATION
    Nguyen, Thai-Son
    Stuker, Sebastian
    Niehues, Jan
    Waibel, Alex
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020 : 7689 - 7693
  • [7] Active broad learning with multi-objective evolution for data stream classification
    Cheng, Jian
    Zheng, Zhiji
    Guo, Yinan
    Pu, Jiayang
    Yang, Shengxiang
    [J]. COMPLEX & INTELLIGENT SYSTEMS, 2024, 10 (01) : 899 - 916
  • [8] Sleep Disorder Data Stream Classification Based on Classifiers Ensemble and Active Learning
    Cai, Liangming
    Datta, Rituparna
    Huang, Jingshan
    Dong, Shuai
    Du, Min
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2019 : 1432 - 1435
  • [9] Nonstationary data stream classification with online active learning and siamese neural networks
    Malialis, Kleanthis
    Panayiotou, Christos G.
    Polycarpou, Marios M.
    [J]. NEUROCOMPUTING, 2022, 512 : 235 - 252