Hierarchical classification of data streams: a systematic literature review

Cited by: 9
Authors
Tieppo, Eduardo [1, 2]
dos Santos, Roger Robson [2]
Barddal, Jean Paul [2]
Nievola, Julio Cesar [2]
Affiliations
[1] Inst Fed Parana IFPR, Campus Pinhais, Pinhais, Brazil
[2] Pontificia Univ Catolica Parana PUCPR, Posgrad Informat PPGIa, Curitiba, Parana, Brazil
Keywords
Data stream mining; Hierarchical classification; Systematic literature review; Machine learning; Activity recognition; Object recognition; Classifiers; Machine; Representation; Performance; Algorithm; Agreement; Quality; Drift
DOI
10.1007/s10462-021-10087-z
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The classification task usually works with flat and batch learners, assuming that problems are stationary and that class labels are unrelated. Nevertheless, several real-world problems do not satisfy these premises: their data have labels organized hierarchically and are made available in streaming fashion, meaning that their behavior can drift over time. Existing studies on hierarchical classification do not consider data streams as input, so data are assumed to be stationary and are handled through batch learners. The same can be said about works on streaming data, which overlook hierarchical classification. Studies concerning each area individually are promising, yet they do not tackle the intersection of the two. This study analyzes the main characteristics of the state-of-the-art works on hierarchical classification for streaming data concerning five aspects: (i) problems tackled, (ii) datasets, (iii) algorithms, (iv) evaluation metrics, and (v) research gaps in the area. We performed a systematic literature review of primary studies and retrieved 3,722 papers, of which 42 were identified as relevant and used to answer the aforementioned research questions. We found that the problems handled by hierarchical classification of data streams include mainly classification of images, human activities, texts, and audio; the datasets are mostly created or synthetic data; the algorithms and evaluation metrics are well-known techniques or based on those; and research gaps are related to dynamic context, data complexity, and computational resource constraints. We also provide implications for future research and experiments to consider common characteristics shared between hierarchical classification and data stream classification.
Pages: 3243-3282
Page count: 40