Investigating macro-level hotzone identification and variable importance using big data: A random forest models approach

Cited by: 42
Authors
Jiang, Ximiao [1]
Abdel-Aty, Mohamed [2]
Hu, Jia [1]
Lee, Jaeyoung [2]
Affiliations
[1] Fed Highway Adm, Off Operat R&D, Mclean, VA 22101 USA
[2] Univ Cent Florida, Dept Civil Environm & Construct Engn, Orlando, FL 32816 USA
Keywords
Hotzone identification; Big data; Connected Vehicle; Variable importance; Random forest; Wilcoxon test; TRAFFIC ACCIDENTS; SPATIAL-ANALYSIS; INJURY SEVERITY; SAFETY ANALYSIS; ROAD CRASHES; LAND-USE; CLASSIFICATION; LEVEL; HETEROGENEITY; COLLISIONS;
DOI
10.1016/j.neucom.2015.08.097
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
As Connected Vehicle technologies begin to be deployed along roadway networks, they will provide massive amounts of data. This big data can be useful in identifying safety-hazardous zones, a task that is complicated and unreliable today. Without sufficient data, past studies had to focus mostly on micro-level networks. Research on macro-level hotzone identification is limited, and the contribution of various macroscopic features to macro-level crash risk remains in dispute. This paper, with the help of massive amounts of data, investigates the feasibility of using random forests for hotzone identification at the macro level, specifically the Traffic Analysis Zone (TAZ) level. At the same time, the most influential macro-level crash risk determinants were identified by applying a series of random forest models in combination with cross-validation methods. The differences in all features between hotzones and normal TAZs were also identified through Wilcoxon tests. Crash data from three counties in Florida for 2008 and 2009 were employed. Crash risks by injury level and by collision type were investigated separately. Finally, the significance of various macroscopic variables was determined for the different types of crash risk using variable importance analysis. The results suggest that the distribution of the road network and socio-economics are the two most important factors when proactively alleviating traffic safety issues. For developed urban areas, it is desirable to formulate specific traffic safety management strategies that account for zone-level socio-economics and the development of road infrastructure. For zones with a higher percentage of school enrollment, pedestrian- and bicycle-friendly roadway system designs are most beneficial. It is also desirable to take efficient countermeasures, such as law enforcement and driving school training, to regulate young drivers' behavior in school zones. For areas with high minority residence, there might be a need for awareness campaigns in multiple languages to relieve pedestrian safety issues. Finally, additional attention should be paid to improving intersection design and management during the planning and operation processes. Published by Elsevier B.V.
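The abstract mentions comparing features between hotzones and normal TAZs with Wilcoxon tests but does not state which variant was used. As a minimal sketch, assuming the two groups are independent samples and therefore that the rank-sum (Mann-Whitney) form applies, the test for a single feature could look like the following. The function names and the sample values are hypothetical and are not taken from the paper; the p-value uses the normal approximation with a tie correction and no continuity correction.

```python
import math


def rank_with_ties(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_rank_sum(x, y):
    """Two-sided rank-sum test (normal approximation). Returns (z, p)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = rank_with_ties(combined)
    w = sum(ranks[:n1])            # rank sum of the first group
    mean_w = n1 * (n + 1) / 2      # expected rank sum under the null

    # Tie correction: subtract sum(t^3 - t) / (n (n - 1)) over tie groups.
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_w = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))

    if var_w == 0:                 # every observation identical
        return 0.0, 1.0
    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))
    return z, p


if __name__ == "__main__":
    # Hypothetical values of one zone-level feature (illustrative only).
    hotzone_feature = [14.2, 11.8, 15.1, 13.0, 12.7, 16.3]
    normal_feature = [9.9, 10.4, 12.1, 8.7, 11.0, 10.2, 9.5]
    z, p = wilcoxon_rank_sum(hotzone_feature, normal_feature)
    print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```

For groups this small an exact test would normally be preferred over the normal approximation; the approximation is used here only to keep the sketch dependency-free.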
Pages: 53-63
Number of pages: 11