Cleaning Big Data Streams: A Systematic Literature Review

Cited by: 6
Authors
Alotaibi, Obaid [1, 2]
Pardede, Eric [2]
Tomy, Sarath [3]
Bagui, Sikha
Iacono, Mauro
Affiliations
[1] Shaqra Univ, Coll Arts & Sci, Dept Comp Sci, Sajir Campus, Sajir City 11951, Saudi Arabia
[2] La Trobe Univ, Sch Engn & Math Sci, Dept Comp Sci & Informat Technol, Melbourne Campus, Melbourne, Vic 3086, Australia
[3] La Trobe Univ, Sch Engn & Math Sci, Dept Comp Sci & Informat Technol, Bendigo Campus, Flora Hill, Vic 3552, Australia
Keywords
clean; big data; stream; machine learning; deep learning; artificial intelligence; missing value; outliers; duplicate data; irrelevant data; OUTLIER DETECTION; ANOMALY DETECTION; FRAMEWORK;
DOI
10.3390/technologies11040101
Chinese Library Classification
T [Industrial Technology]
Discipline classification code
08
Abstract
In today's big data era, cleaning big data streams has become a challenging task because of the varied formats of the data and the massive volume being generated. Many studies have proposed techniques to overcome these challenges, such as cleaning big data in real time. This systematic literature review presents recently developed techniques that have been used for the cleaning process and for each data cleaning issue. Following the PRISMA framework, four databases are searched, namely IEEE Xplore, ACM Library, Scopus, and Science Direct, to select relevant studies. After selecting the relevant studies, we identify the techniques that have been utilized to clean big data streams and the evaluation methods that have been used to examine their efficiency. We also define the cleaning issues that may appear during the cleaning process, namely missing values, duplicated data, outliers, and irrelevant data. Based on our study, the future directions of cleaning big data streams are identified.
Pages: 24
Related papers
50 in total
  • [21] Big Data: Opportunities and Challenges in Libraries, a Systematic Literature Review
    Garoufallou, Emmanouel
    Gaitanou, Panorea
    COLLEGE & RESEARCH LIBRARIES, 2021, 82 (03): 410-435
  • [22] Big Data in Food: Systematic Literature Review and Future Directions
    Chakraborty, Debarun
    Rana, Nripendra P.
    Khorana, Sangeeta
    Singu, Hari Babu
    Luthra, Sunil
    JOURNAL OF COMPUTER INFORMATION SYSTEMS, 2023, 63 (05): 1243-1263
  • [23] Big data and analytics in hospitality and tourism: a systematic literature review
    Mariani, Marcello
    Baggio, Rodolfo
    INTERNATIONAL JOURNAL OF CONTEMPORARY HOSPITALITY MANAGEMENT, 2022, 34 (01): 231-278
  • [24] Positive deviance, big data, and development: A systematic literature review
    Albanna, Basma
    Heeks, Richard
    ELECTRONIC JOURNAL OF INFORMATION SYSTEMS IN DEVELOPING COUNTRIES, 2019, 85 (01)
  • [25] Big Data Maturity Assessment Models: A Systematic Literature Review
    Al-Sai, Zaher Ali
    Husin, Mohd Heikal
    Syed-Mohamad, Sharifah Mashita
    Abdullah, Rosni
    Zitar, Raed Abu
    Abualigah, Laith
    Gandomi, Amir H.
    BIG DATA AND COGNITIVE COMPUTING, 2023, 7 (01)
  • [26] Critical Success Factors for Big Data: A Systematic Literature Review
    Al-Sai, Zaher Ali
    Abdullah, Rosni
    Husin, Mohd Heikal
    IEEE ACCESS, 2020, 8: 118940-118956
  • [27] Big data applications on the Internet of Things: A systematic literature review
    Ahmadova, Ulkar
    Mustafayev, Mustafa
    Kiani Kalejahi, Behnam
    Saeedvand, Saeed
    Rahmani, Amir Masoud
    INTERNATIONAL JOURNAL OF COMMUNICATION SYSTEMS, 2021, 34 (18)
  • [28] Machine Learning and Big Data for Cybersecurity: Systematic Literature Review
    El Bouchtioui, En Naji
    Bentaleb, Asmae
    Abouchabaka, Jaafar
    DIGITAL TECHNOLOGIES AND APPLICATIONS, ICDTA 2024, VOL 1, 2024, 1098: 97-106
  • [29] Privacy Prevention of Big Data Applications: A Systematic Literature Review
    Rafiq, Fatima
    Awan, Mazhar Javed
    Yasin, Awais
    Nobanee, Haitham
    Zain, Azlan Mohd
    Bahaj, Saeed Ali
    SAGE OPEN, 2022, 12 (02)
  • [30] A systematic literature review on the use of big data for sustainable tourism
    Rahmadian, E.
    Feitosa, D.
    Zwitter, A.
    CURRENT ISSUES IN TOURISM, 2022, 25 (11): 1711-1730