Finding and understanding pedal misapplication crashes using a deep learning natural language model

Cited by: 4
Authors
Bareiss, Max [1]
Smith, Colin [1]
Gabler, Hampton C. [1]
Affiliations
[1] Virginia Tech, Dept Biomed Engn, Blacksburg, VA USA
Keywords
Pedal misapplication; NMVCCS; deep learning; BERT; NLP
DOI
10.1080/15389588.2021.1982616
Chinese Library Classification number
R1 [Preventive Medicine, Hygiene]
Discipline classification code
1004; 120402
Abstract
Objective: The objective of this study was to develop a system that uses the BERT natural language understanding model to identify pedal misapplication (PM) crashes from their crash narratives, and to validate the accuracy of that system.
Methods: The training dataset comprised 11 cases from the NMVCCS study and 952 cases from the North Carolina state crash database. Cases were selected from their respective full datasets using a keyword search for terms indicative of a pedal-related mistake. A BERT language model was used to classify each case narrative as no pedal misapplication, PM by vehicle 1, PM by vehicle 2, or PM by vehicle 3. After training, the language model was used to determine the incidence of PM in a test dataset of 8,668 North Carolina and NMVCCS cases, and these results were compared with a manual review of the dataset. Manual review identified 2,969 cases as pedal misapplications.
Results: The model's area under the ROC curve (AUC) for detecting PM was computed on the entire test dataset to evaluate how well the system generalizes to case narratives unseen at training time. The AUC was 0.9835, indicating strong generalization to all crash narratives. With the optimal threshold chosen from the ROC curve, the system correctly identified PM in 95.7% of crash narratives. When PM was correctly identified, the correct vehicle was identified in 95.9% of cases. A total of 3,062 pedal misapplications were identified. The model labeled cases 353 times faster than a researcher.
Conclusions: The strong performance of the model suggests that the automated interpretation of case narratives can be used in future research studies without any manual review. This would save time and enable the use of datasets for which manual review would be infeasible.
The automated extraction of information from crash narratives using deep learning natural language models has not been demonstrated previously in the literature, to the best of the authors' knowledge. This technique can be applied to large, infrequently used datasets of crash narratives and extended to extract useful vehicle, occupant, or environment information to make these datasets amenable to traditional statistical analyses.
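The abstract reports an AUC and a threshold chosen "using the ROC curve" but does not say which criterion defines the optimal threshold. The sketch below is one plausible reading, assuming Youden's J statistic (TPR minus FPR) as the criterion. The function names `roc_points`, `auc`, and `best_threshold` and the toy scores are hypothetical illustrations, not the authors' code or data.

```python
from typing import List, Sequence, Tuple

RocPoint = Tuple[float, float, float]  # (threshold, false positive rate, true positive rate)


def roc_points(labels: Sequence[int], scores: Sequence[float]) -> List[RocPoint]:
    """Build ROC points, one per distinct score, from highest threshold to lowest.

    A narrative is predicted as PM when its score is >= the threshold.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points: List[RocPoint] = []
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every case tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((threshold, fp / neg, tp / pos))
    return points


def auc(points: Sequence[RocPoint]) -> float:
    """Trapezoidal area under the ROC curve, starting from the origin (0, 0)."""
    area = 0.0
    prev_fpr, prev_tpr = 0.0, 0.0
    for _, fpr, tpr in points:
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
        prev_fpr, prev_tpr = fpr, tpr
    return area


def best_threshold(points: Sequence[RocPoint]) -> float:
    """Threshold maximizing Youden's J = TPR - FPR (an assumed criterion)."""
    return max(points, key=lambda p: p[2] - p[1])[0]


# Toy example: 1 = manual review says PM, 0 = no PM; scores are invented.
labels = [1, 1, 0, 1, 1, 0, 0, 0]
scores = [0.95, 0.90, 0.70, 0.65, 0.60, 0.40, 0.30, 0.10]
points = roc_points(labels, scores)
print("AUC:", auc(points))
print("Chosen threshold:", best_threshold(points))
```

Other criteria, such as minimizing the distance to the top-left corner of the ROC plot or fixing a target sensitivity, would select a different threshold on the same curve, so the 95.7% figure should be read as specific to whatever rule the authors applied.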
Pages: S169-S172
Number of pages: 4