Prediction of PM2.5 Concentration Based on Ensemble Learning

Cited by: 0
Authors
Peng Y. [1]
Zhao Z.-R. [1]
Wu T.-X. [1]
Wang J. [1]
Affiliation
[1] School of Management, Capital Normal University, Beijing
Keywords
Analysis of influencing factors; Gradient boosting decision tree; Integrated feature selection; PM2.5 prediction model
DOI
10.13190/j.jbupt.2019-153
Abstract
Rising PM2.5 concentration is one cause of haze. Effectively predicting PM2.5 concentration and analyzing the factors that influence it play an important role in air quality forecasting and control. Given the nonlinearity and uncertainty of PM2.5 concentration, the article presents a prediction model, ensemble trees-GBDT, that first selects features using ensemble trees and then predicts with a gradient boosting decision tree (GBDT). Using standard arithmetic-mean aggregation, the article calculates each feature's degree of influence on the increase in PM2.5 concentration and ranks the features from strongest to weakest impact. Grid search is used to select the optimal parameters of the GBDT algorithm, such as tree depth. Two datasets, pollutant concentration data and meteorological observation data for Beijing from 2015 to 2016, are used with the proposed prediction model. Compared with standard models such as decision tree, random forest, and support vector machine, the ensemble trees-GBDT model is found to have lower mean absolute error, lower root mean square error, and better generalization ability. © 2019, Editorial Department of Journal of Beijing University of Posts and Telecommunications. All rights reserved.
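The abstract's pipeline (gradient boosting, grid search over hyperparameters, evaluation by mean absolute error and root mean square error) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the data are synthetic, the function names (`fit_stump`, `fit_gbdt`, `grid_search`, `run_demo`) are hypothetical, the trees are fixed at depth 1, and the grid therefore covers the number of boosting rounds and the learning rate rather than tree depth. The ensemble-tree feature selection step is omitted.

```python
import math
import random


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_stump(X, residuals):
    """Fit a depth-1 regression tree; return (feature, threshold, left_mean, right_mean)."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Splitting at every value except the largest keeps both sides non-empty.
        for t in values[:-1]:
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:]


def fit_gbdt(X, y, n_rounds, learning_rate):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]  # negative gradient
        j, t, lm, rm = fit_stump(X, residuals)
        stumps.append((j, t, lm, rm))
        pred = [p + learning_rate * (lm if row[j] <= t else rm)
                for row, p in zip(X, pred)]
    return base, learning_rate, stumps


def predict(model, X):
    base, lr, stumps = model
    return [base + lr * sum(lm if row[j] <= t else rm for j, t, lm, rm in stumps)
            for row in X]


def grid_search(X_train, y_train, X_valid, y_valid, grid):
    """Return the (n_rounds, learning_rate) pair with the lowest validation MAE."""
    best = None
    for n_rounds, lr in grid:
        model = fit_gbdt(X_train, y_train, n_rounds, lr)
        score = mae(y_valid, predict(model, X_valid))
        if best is None or score < best[0]:
            best = (score, (n_rounds, lr))
    return best[1]


def make_data(n, seed):
    """Synthetic stand-in data (assumption): two informative features, one noise feature."""
    rng = random.Random(seed)
    X = [[rng.random() for _ in range(3)] for _ in range(n)]
    y = [3.0 * a - 2.0 * b + rng.gauss(0.0, 0.1) for a, b, _ in X]
    return X, y


def run_demo(seed=0):
    X, y = make_data(90, seed)
    X_tr, y_tr = X[:50], y[:50]
    X_va, y_va = X[50:70], y[50:70]
    X_te, y_te = X[70:], y[70:]
    grid = [(r, lr) for r in (20, 60) for lr in (0.1, 0.3)]
    best = grid_search(X_tr, y_tr, X_va, y_va, grid)
    model = fit_gbdt(X_tr, y_tr, *best)
    pred = predict(model, X_te)
    baseline = [sum(y_tr) / len(y_tr)] * len(y_te)  # always predict the training mean
    return {"best_params": best, "mae": mae(y_te, pred),
            "rmse": rmse(y_te, pred), "baseline_rmse": rmse(y_te, baseline)}


if __name__ == "__main__":
    print(run_demo())
```

Calling `run_demo()` returns the selected hyperparameters together with the test-set MAE and RMSE, plus the RMSE of a mean-only baseline for comparison. With real pollutant and meteorological data, a library implementation with tunable tree depth would be the more natural choice.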
Pages: 162-169
Page count: 7
Related papers
15 in total
  • [1] Zhang Q., Rao C., Correlation analysis between PM2.5 and PM10 ratio in typical regional cities, Journal of Green Science and Technology, 12, pp. 129-130, (2019)
  • [2] Liu X., Wang H., An analysis of vehicle-related PM2.5 emissions: The perspective from China and Europe, Acta Scientiae Circumstantiae, 39, 8, pp. 2830-2838, (2019)
  • [3] Li J., Liu X., Liu J., et al., Prediction of PM2.5 concentration based on MRMR-HK-SVM model, China Environmental Science, 39, 6, pp. 2304-2310, (2019)
  • [4] Wang P., Zhang H., Qin Z., et al., PM10 concentration forecasting model based on wavelet-SVM, Environmental Science, 38, 8, pp. 3153-3161, (2017)
  • [5] Ren C., Xie G., Prediction of PM2.5 concentration level based on random forest and meteorological parameters, Computer Engineering and Applications, 55, 2, pp. 213-220, (2019)
  • [6] Huang J., Zhang F., Du Z., et al., Hourly concentration prediction of PM2.5 based on RNN-CNN ensemble deep learning model, Journal of Zhejiang University (Science Edition), 46, 3, pp. 370-379, (2019)
  • [7] Zhang L., Yuan Y., Wang C., FCBF feature selection algorithm based on maximum information coefficient, Journal of Beijing University of Posts and Telecommunications, 41, 4, pp. 86-90, (2018)
  • [8] Cui H., Xu S., Zhang L., et al., The key techniques and future vision of feature selection in machine learning, Journal of Beijing University of Posts and Telecommunications, 41, 1, pp. 1-12, (2018)
  • [9] Dietterich T.G., Machine learning research: four current directions, AI Magazine, 18, 4, pp. 97-136, (1997)
  • [10] Liu Y., Chen B., Zhou Z., An improved feature selection algorithm based on random forest, Modern Electronics Technique, 42, 12, pp. 117-121, (2019)