Improvised methods for tackling big data stream mining challenges: case study of human activity recognition

Cited by: 0
Authors
Simon Fong
Kexing Liu
Kyungeun Cho
Raymond Wong
Sabah Mohammed
Jinan Fiaidhi
Affiliations
[1] University of Macau, Department of Computer and Information Science
[2] Dongguk University, Department of Multimedia Engineering
[3] University of New South Wales, School of Computer Science and Engineering
[4] Lakehead University, Department of Computer Science
Source
Journal of Supercomputing, 2016, 72 (10)
Keywords
Data stream mining; Big data; Very fast decision tree; Resampling; Sensor data
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
Big data stream mining is newly hyped, but it poses a practical computational challenge rooted in the data streams that are prevalent in today's applications. It is well known that data streams originating from monitoring sensors accumulate continuously to very large volumes, making traditional batch-based model induction algorithms infeasible for real-time data mining or just-in-time data analytics. In this position paper, following a new data stream mining methodology called stream-based holistic analytics and reasoning in parallel (SHARP), we examine a list of data analytic challenges together with improvised methods for addressing them. In particular, two types of decision tree algorithms, batch-mode and incremental-mode, are tested on sensor data that represents a typical big data stream. We investigate whether, and to what extent, two improvised methods (outlier removal and balancing of imbalanced class distributions) affect prediction performance in big data stream mining. SHARP is founded on incremental learning, which does not require all the training data to be loaded into memory. This fundamental concept needs to be supported not only by the decision tree algorithms but also by the other improvised methods, which are usually applied at the preprocessing stage. This paper sheds some light on this area, which data analysts often overlook when it comes to big data stream mining.
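The abstract's point that preprocessing steps must themselves be incremental can be illustrated with a minimal sketch. The code below is not the paper's method: it is an assumed example of streaming outlier removal that keeps only a running mean and variance (Welford's algorithm), so no reading has to be stored in memory. The names `StreamingOutlierFilter`, `filter_stream`, `z_threshold`, and `warmup` are hypothetical and chosen only for illustration.

```python
import math


class StreamingOutlierFilter:
    """Hypothetical one-pass outlier filter; holds O(1) state per sensor channel."""

    def __init__(self, z_threshold=3.0, warmup=10):
        self.z_threshold = z_threshold  # assumed cut-off, in standard deviations
        self.warmup = warmup            # readings accepted before filtering starts
        self.n = 0                      # number of accepted readings so far
        self.mean = 0.0                 # running mean of accepted readings
        self.m2 = 0.0                   # running sum of squared deviations

    def _update(self, x):
        # Welford's update: refresh mean and m2 without storing past readings.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def accept(self, x):
        """Return True if x is kept; only kept readings update the statistics."""
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(x - self.mean) > self.z_threshold * std:
                return False
        self._update(x)
        return True


def filter_stream(readings, **kwargs):
    """Lazily yield the readings that pass the filter, one at a time."""
    outlier_filter = StreamingOutlierFilter(**kwargs)
    for x in readings:
        if outlier_filter.accept(x):
            yield x


if __name__ == "__main__":
    sensor = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 1.1, 0.9, 1.0, 50.0, 1.02]
    print(list(filter_stream(sensor, z_threshold=3.0, warmup=10)))
```

Because the filter is a generator over single readings, its output could in principle be fed directly to an incremental learner. A fixed z-score threshold is only one possible criterion and would need adapting if the stream's distribution drifts.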
Pages: 3927-3959
Number of pages: 32
Related papers
50 in total
  • [1] Improvised methods for tackling big data stream mining challenges: case study of human activity recognition
    Fong, Simon
    Liu, Kexing
    Cho, Kyungeun
    Wong, Raymond
    Mohammed, Sabah
    Fiaidhi, Jinan
    [J]. JOURNAL OF SUPERCOMPUTING, 2016, 72 (10): 3927 - 3959
  • [2] Data stream mining: methods and challenges for handling concept drift
    Scott Wares
    John Isaacs
    Eyad Elyan
    [J]. SN Applied Sciences, 2019, 1
  • [3] Data stream mining: methods and challenges for handling concept drift
    Wares, Scott
    Isaacs, John
    Elyan, Eyad
    [J]. SN APPLIED SCIENCES, 2019, 1 (11)
  • [4] Privacy-Preserving Big Data Stream Mining: Opportunities, Challenges, Directions
    Cuzzocrea, Alfredo
    [J]. 2017 17TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2017), 2017: 992 - 994
  • [5] Comparing Sampling Strategies for Tackling Imbalanced Data in Human Activity Recognition
    Alharbi, Fayez
    Ouarbya, Lahcen
    Ward, Jamie A.
    [J]. SENSORS, 2022, 22 (04)
  • [6] A Case Study of Medical Big Data Processing: Data Mining for the Hyperuricemia
    Tan, Junyan
    Xiong, Tianyu
    Miao, Hongxia
    Sun, Rurong
    Wu, Min
    [J]. 2018 IEEE 3RD INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA ANALYSIS (ICCCBDA), 2018: 196 - 201
  • [7] Human activity recognition in big data smart home context
    Azzi, Sabrina
    Dallaire, Cindy
    Bouzouane, Abdenour
    Bouchard, Bruno
    Giroux, Sylvain
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2014
  • [8] Comparative Study of Various Decision Tree Methods for Data Stream Mining
    Mehta, Vaishali
    Sanghavi, Vishakha
    [J]. THIRD INTERNATIONAL CONGRESS ON INFORMATION AND COMMUNICATION TECHNOLOGY, 2019, 797 : 371 - 379
  • [9] Smart Phone Based Data Mining For Human Activity Recognition
    Chetty, Girija
    White, Matthew
    Akther, Farnaz
    [J]. PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON INFORMATION AND COMMUNICATION TECHNOLOGIES, ICICT 2014, 2015, 46 : 1181 - 1187
  • [10] A Benchmark of Data Stream Classification for Human Activity Recognition on Connected Objects
    Khannouz, Martin
    Glatard, Tristan
    [J]. SENSORS, 2020, 20 (22) : 1 - 17