An evaluation of various data pre-processing techniques with machine learning models for water level prediction

Authors
Ervin Shan Khai Tiu
Yuk Feng Huang
Jing Lin Ng
Nouar AlDahoul
Ali Najah Ahmed
Ahmed Elshafie
Affiliations
[1] Universiti Tunku Abdul Rahman,Department of Civil Engineering, Lee Kong Chian Faculty of Engineering and Science
[2] UCSI University,Department of Civil Engineering, Faculty of Engineering, Technology and Built Environment
[3] Multimedia University,Faculty of Engineering
[4] University Tenaga Nasional (UNITEN),Institute of Energy Infrastructure (IEI), Department of Civil Engineering, College of Engineering
[5] University of Malaya,Department of Civil Engineering, Faculty of Engineering
Source
Natural Hazards | 2022 / Vol. 110
Keywords
Artificial neural network; Bagging; Boosting; River water level prediction; Support vector regression; Variational Mode Decomposition
DOI
Not available
Abstract
Floods are the most frequent type of natural disaster. They destroy wildlife habitat, damage bridges, railways, roads, and property, and put millions of people at risk. Flood detection systems have therefore been developed to monitor changes in water level and raise an alarm when danger is imminent. River water level prediction is an important task in flood mitigation planning and floodplain management. Feeding raw rainfall series directly into machine learning (ML) regression methods usually does not yield sufficiently accurate predictions; the raw data should first be pre-processed with specific techniques to improve their quality before they are passed to the prediction methods. This paper addresses that problem by applying five data pre-processing techniques, namely Variational Mode Decomposition (VMD), Bagging, Boosting, Bagging-VMD, and Boosting-VMD, to enhance the quality of the input data and thereby improve model accuracy. The five techniques were applied to the observed daily rainfall series of the Dungun river basin, Malaysia, for the November to February period (Northeast Monsoon) from 1996 to 2016. Two ML models, the artificial neural network (ANN) and support vector regression (SVR), served as the base models (Ori) and were used in conjunction with the pre-processing methods. The ML methods were compared with and without data pre-processing. Water level predictions from SVR and ANN combined with Boosting-VMD were superior to those from the base models (Ori) alone. The advantage of the enhanced models (based on SVR and ANN, respectively) over the original models (SVR and ANN) is best reflected in the performance statistics.
For the respective models (enhanced vs. original), the numerical results were a root mean square error (RMSE) of (0.42, 0.20 vs. 1.85, 1.82), a mean absolute percentage error (MAPE) of (4.36, 2.82 vs. 18.89, 22.56), a mean absolute error (MAE) of (0.28, 0.16 vs. 1.25, 1.41), and a Nash–Sutcliffe efficiency coefficient (NSE) of (0.96, 0.99 vs. 0.25, 0.27). Additionally, data visualizations such as hydrographs, residual hydrographs, peak estimates, and box-and-whisker plots were used to compare the pre-processing techniques. The experimental results showed that the Boosting and Boosting-VMD methods outperformed the other techniques. The Boosting-ANN model was found to be the best-performing model for predicting river water levels, with the lowest RMSE (0.19), MAPE (2.72), and MAE (0.15) and the highest NSE (0.99).
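The four performance statistics named in the abstract can be sketched as follows. This is a minimal illustration using the standard textbook definitions, not the authors' code: the function names, the sample values, and the choice to express MAPE as a percentage are all assumptions, since the abstract does not state the exact formulas used.

```python
import math

# Hypothetical helper functions; standard definitions are assumed,
# the paper's exact formulas are not given in the abstract.

def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def mape(obs, pred):
    """Mean absolute percentage error, in percent (assumed convention).
    Requires every observed value to be non-zero."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)

def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the
    skill of always predicting the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance

# Illustrative values only, not data from the study.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

print(rmse(observed, predicted))
print(mae(observed, predicted))
print(mape(observed, predicted))
print(nse(observed, predicted))
```

Lower RMSE, MAE, and MAPE and an NSE closer to 1 indicate a better fit, which is the sense in which the abstract ranks the models.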
Pages: 121 - 153
Number of pages: 32