Data reduction in big data: a survey of methods, challenges and future directions

Cited by: 0

Authors:

Khoei, Tala Talaei [1]

Singh, Aditi [2]

Affiliations:

[1] Northeastern Univ, Khoury Coll Comp Sci, Roux Inst, Portland, ME 04101 USA

[2] Cleveland State Univ, Washkewicz Coll Engn, Cleveland, OH USA

Source:

INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS | 2024

Keywords:

Artificial intelligence; Biometrics; Crime; Detection; Emotions; Facial recognition; Prediction; Policing; CLASSIFICATION; COMPRESSION;

DOI:

10.1007/s41060-024-00603-z

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Data reduction plays a pivotal role in managing and analyzing big data, which is characterized by its volume, velocity, variety, veracity, value, variability, and visibility. Several surveys have summarized these techniques in the field of big data, but they leave several concerns that require attention, such as limited discussion of the reduction techniques themselves. Also, most of these studies focused on applications and only described their techniques. In contrast, this survey provides a comprehensive overview of data reduction methods, challenges, and future directions in the general context of big data analytics. The survey begins by discussing the significance of data reduction in addressing the scalability and complexity issues inherent in big data processing. Subsequently, a classification of data reduction methods in big data is provided. For each category, the underlying principles, popular algorithms, and applications in big data analytics are highlighted. Moreover, the key challenges associated with data reduction in the era of big data, such as scalability, computational complexity, quality preservation, and interpretability, are identified and discussed, and the importance of addressing these challenges to ensure the effectiveness and reliability of data reduction techniques in large-scale data analytics is reviewed. This survey can serve as a comprehensive reference for researchers, practitioners, and stakeholders interested in understanding and using data reduction techniques to address the challenges and opportunities posed by big data. Finally, the tangible results of this study include introducing techniques that improve storage efficiency and speed up computational processing by minimizing dataset size; these techniques can also enhance data analysis by removing redundancy and noise, leading to more accurate and actionable insights.


Pages: 40

