Distance variable improvement of time-series big data stream evaluation

Cited by: 3
Authors
Wibisono, Ari [1 ]
Mursanto, Petrus [1 ]
Adibah, Jihan [1 ]
Bayu, Wendy D. W. T. [1 ]
Rizki, May Iffah [1 ]
Hasani, Lintang Matahari [1 ]
Ahli, Valian Fil [1 ]
Affiliations
[1] Univ Indonesia, Fac Comp Sci, Ui Depok 16424, Depok, Indonesia
Keywords
Intelligent Systems; Data stream; Distance improvement; Big data regression; Prediction
DOI
10.1186/s40537-020-00359-w
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
Mining information in real time from a large time-series dataset is a challenging task. To this end, we propose using the mean distance and the standard deviation to improve the accuracy of the existing Fast Incremental Model Trees with Drift Detection (FIMT-DD) algorithm. The standard FIMT-DD algorithm uses the Hoeffding bound as its splitting criterion; we additionally use the mean distance and standard deviation so that the tree is split more accurately than with the standard method. We verify the proposed method on three datasets: the large Traffic Demand Dataset (4,000,000 instances), Tennet's big wind power plant dataset (435,268 instances), and a road weather dataset (30,000,000 instances). The results show that the proposed FIMT-DD variant is more accurate than both the standard method and the Chernoff bound approach. The measured errors show a lower Mean Absolute Percentage Error (MAPE) at every stage of learning, by approximately 2.49% compared with the Chernoff bound method and 19.65% compared with the standard method.
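As a rough illustration of two quantities named in the abstract, the sketch below computes the textbook Hoeffding bound, applies it in a split test of the kind commonly described for FIMT-DD (comparing the ratio of the second-best to the best split merit against 1), and computes MAPE. The function names, the `value_range` default, and the choice to skip zero-valued targets in MAPE are assumptions made for this example. The abstract does not specify how the mean distance and standard deviation enter the splitting rule, so that part of the proposed method is not reproduced here.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Textbook Hoeffding bound: with probability 1 - delta, the true mean of a
    variable with the given range lies within this epsilon of the mean of n samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_merit, second_merit, delta, n, value_range=1.0):
    """Hypothetical split test: split once the ratio of the second-best to the
    best candidate merit, plus epsilon, is below 1 (assumed form, not the paper's)."""
    if best_merit <= 0:
        return False
    ratio = second_merit / best_merit
    return ratio + hoeffding_bound(value_range, delta, n) < 1.0


def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent; zero-valued targets are skipped
    (an assumption of this sketch) to avoid division by zero."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


if __name__ == "__main__":
    # The bound shrinks as more instances arrive, so a split that is
    # rejected early in the stream can be accepted later.
    print(should_split(1.0, 0.9, delta=0.01, n=50))
    print(should_split(1.0, 0.9, delta=0.01, n=5000))
    print(mape([100, 200], [110, 180]))
```

Because epsilon shrinks in proportion to 1/sqrt(n), the same pair of candidate splits needs more observed instances before the test is confident enough to split.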
Pages: 13
Related papers
50 in total
  • [1] Distance variable improvement of time-series big data stream evaluation
    Ari Wibisono
    Petrus Mursanto
    Jihan Adibah
    Wendy D. W. T. Bayu
    May Iffah Rizki
    Lintang Matahari Hasani
    Valian Fil Ahli
    [J]. Journal of Big Data, 7
  • [2] Time-Series Big Data Stream Evaluation
    Mursanto, Petrus
    Wibisono, Ari
    Bayu, Wendy D. W. T.
    Ahli, Valian Fil
    Rizki, May Iffah
    Hasani, Lintang Matahari
    Adibah, Jihan
    [J]. 2020 5TH INTERNATIONAL WORKSHOP ON BIG DATA AND INFORMATION SECURITY (IWBIS 2020), 2020, : 43 - 47
  • [3] Kennard-Stone Balance Algorithm for Time-series Big Data Stream Mining
    Li, Tengyue
    Fong, Simon
    Wu, Yaoyang
    Tallon-Ballesteros, Antonio J.
    [J]. 20TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2020), 2020, : 851 - 858
  • [4] Mining and Forecasting of Big Time-series Data
    Sakurai, Yasushi
    Matsubara, Yasuko
    Faloutsos, Christos
    [J]. SIGMOD'15: PROCEEDINGS OF THE 2015 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2015, : 919 - 922
  • [5] Mining Big Time-series Data on the Web
    Sakurai, Yasushi
    Matsubara, Yasuko
    Faloutsos, Christos
    [J]. PROCEEDINGS OF THE 25TH INTERNATIONAL CONFERENCE ON WORLD WIDE WEB (WWW'16 COMPANION), 2016, : 1029 - 1032
  • [6] Mining and Forecasting of Big Time-series Data
    Sakurai, Yasushi
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON PERVASIVE COMPUTING AND COMMUNICATIONS WORKSHOPS (PERCOM WORKSHOPS), 2019, : 607 - 607
  • [7] Big Data Analysis for Sensor Time-Series in Automation
    Jirkovsky, Vaclav
    Obitko, Marek
    Novak, Petr
    Kadera, Petr
    [J]. 2014 IEEE EMERGING TECHNOLOGY AND FACTORY AUTOMATION (ETFA), 2014,
  • [8] Real Time Interpretation and Optimization of Time Series Data Stream in Big Data
    Jiang, Zheyuan
    Liu, Ke
    [J]. 2018 IEEE 3RD INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA ANALYSIS (ICCCBDA), 2018, : 243 - 247
  • [9] Real-time analysis and management of big time-series data
    Biem, A.
    Feng, H.
    Riabov, A. V.
    Turaga, D. S.
    [J]. IBM JOURNAL OF RESEARCH AND DEVELOPMENT, 2013, 57 (3-4)
  • [10] A New Method for Time-Series Big Data Effective Storage
    Tahmassebpour, Mahmoudreza
    [J]. IEEE ACCESS, 2017, 5 : 10694 - 10699