Automatic Quality Control of Transportation Reports Using Statistical Language Processing

Cited by: 3
Authors
Gerber, Matthew S. [1]
Tang, Lu [2]
Affiliations
[1] Univ Virginia, Dept Syst & Informat Engn, Charlottesville, VA 22904 USA
[2] Univ Virginia, Dept Stat, Charlottesville, VA 22904 USA
Keywords
Natural language processing (NLP); quality control; transportation reports; SEARCH
DOI
10.1109/TITS.2013.2265892
Chinese Library Classification (CLC) number
TU [Building Science]
Discipline classification code
0813
Abstract
The processes of developing, monitoring, and maintaining transportation systems produce large volumes of information. Human fieldworkers are often responsible for gathering this information, and despite their best efforts, they will inevitably introduce errors into the collected data. This is a critical problem since: 1) the collected data are used to justify key infrastructure maintenance and development decisions; and 2) the volume of unstructured information (e.g., plain text) makes manual quality control prohibitively expensive. We introduce a solution to this problem in the example domain of vehicle accident reports. First, we analyzed a sample of accident reports and confirmed the existence of many data entry errors. Second, we developed and evaluated a statistical language processing approach that automatically identifies reports containing data entry errors. We tested a variety of system configurations on real-world data and compared their performance with multiple baseline methods. The best configuration achieved a performance score of 84%, far outperforming the baseline methods. Our results and analyses have quality control implications for any data source that pairs structured text (e.g., coded fields) with unstructured text.
Pages: 1681-1689
Page count: 9