FF-BERT: A BERT-based ensemble for automated classification of web-based text on flash flood events

Cited by: 4
Authors
Wilkho, Rohan Singh [1]
Chang, Shi [2]
Gharaibeh, Nasir G. [1]
Affiliations
[1] Texas A&M Univ, Zachry Dept Civil & Environm Engn, College Stn, TX 77840 USA
[2] Trimble Inc, Westminster, CO 80021 USA
Keywords
Flash flood; Text classification; Multi-label text classification; BERT
DOI
10.1016/j.aei.2023.102293
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The web is a rich information repository that can be mined to uncover additional data about past flash flood (FF) events that are currently missing from existing structured databases. However, this information originates from multiple sources (news articles, government records, and weather records, among others) and may cover several topics. Furthermore, these topics may be disproportionately covered on the web. The large size and heterogeneous nature of web information render manual review difficult. To address this challenge, we have developed a multi-label text classification model, FF-BERT. FF-BERT is designed to classify FF-related web paragraphs into one or more of seven categories: (1) Damage and Economic Impact (DI), (2) Fatalities, Injuries, and Rescue (FIR), (3) Hydrometeorology (HM), (4) Warning and Emergency (WE), (5) Response and Recovery (RR), (6) Public Health (PH), and (7) Mitigation (MG). To develop FF-BERT, we labeled 21,180 paragraphs from FF-related webpages and performed experiments with multiple model architectures based on the widely used language model Bidirectional Encoder Representations from Transformers (BERT). Our final model outperforms the baseline by 11.83%, as measured by the micro-F1 score. In addition, FF-BERT significantly improves the prediction of minority labels (RR: 32.1%, PH: 260.4%, and MG: 138.6%). We demonstrate using real-world examples that FF-BERT can be used to uncover new information about flash flood events. This information can be used to enhance existing databases, such as NOAA's Storm Events Database.
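The abstract reports results as a micro-F1 score over seven labels. As a rough illustration of how that metric works in a multi-label setting, the following is a minimal sketch in plain Python. It is not the authors' evaluation code: the function name `micro_f1` and the toy paragraphs are hypothetical, and only the seven label abbreviations come from the abstract.

```python
# The seven category abbreviations listed in the abstract.
LABELS = ["DI", "FIR", "HM", "WE", "RR", "PH", "MG"]


def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for multi-label data.

    y_true and y_pred are equal-length lists of label sets, one set per
    paragraph. True positives, false positives, and false negatives are
    pooled over all paragraphs and labels before F1 is computed.
    """
    tp = fp = fn = 0
    for true_set, pred_set in zip(y_true, y_pred):
        tp += len(true_set & pred_set)   # labels correctly predicted
        fp += len(pred_set - true_set)   # labels predicted but not true
        fn += len(true_set - pred_set)   # true labels that were missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: three paragraphs, each with one or more labels.
y_true = [{"DI", "HM"}, {"FIR"}, {"RR", "MG"}]
y_pred = [{"DI"}, {"FIR", "WE"}, {"RR", "MG"}]

score = micro_f1(y_true, y_pred)
print(f"micro-F1 = {score:.2f}")
```

Because counts are pooled before averaging, frequent labels dominate the micro-F1 score, which may be why the abstract reports the minority labels (RR, PH, MG) separately.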
Pages: 12
Related papers
  • [1] BAE: BERT-based Adversarial Examples for Text Classification
    Garg, Siddhant
    Ramakrishnan, Goutham
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 6174 - 6181
  • [2] Improving BERT-Based Text Classification With Auxiliary Sentence and Domain Knowledge
    Yu, Shanshan
    Su, Jindian
    Luo, Da
    IEEE ACCESS, 2019, 7 : 176600 - 176612
  • [3] Improving Bert-Based Model for Medical Text Classification with an Optimization Algorithm
    Gasmi, Karim
    ADVANCES IN COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2022, 2022, 1653 : 101 - 111
  • [4] Short-Text Classification Detector: A Bert-Based Mental Approach
    Hu, Yongjun
    Ding, Jia
    Dou, Zixin
    Chang, Huiyou
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2022, 2022
  • [5] BERT-Based GitHub Issue Report Classification
    Siddiq, Mohammed Latif
    Santos, Joanna C. S.
    2022 IEEE/ACM 1ST INTERNATIONAL WORKSHOP ON NATURAL LANGUAGE-BASED SOFTWARE ENGINEERING (NLBSE 2022), 2022, : 33 - 36
  • [6] A Study of BERT-Based Classification Performance of Text-Based Health Counseling Data
    Sung, Yeol Woo
    Park, Dae Seung
    Kim, Cheong Ghil
    CMES-COMPUTER MODELING IN ENGINEERING & SCIENCES, 2023, 135 (01): : 795 - 808
  • [7] BERT-based Ensemble Approaches for Hate Speech Detection
    Mnassri, Khouloud
    Rajapaksha, Praboda
    Farahbakhsh, Reza
    Crespi, Noel
    2022 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM 2022), 2022, : 4649 - 4654
  • [8] Three-Branch BERT-Based Text Classification Network for Gastroscopy Diagnosis Text
    Wang Z.
    Zheng X.
    Zhang J.
    Zhang M.
    International Journal of Crowd Science, 2024, 8 (01) : 56 - 63
  • [9] A Fine-Tuned BERT-Based Transfer Learning Approach for Text Classification
    Qasim, Rukhma
    Bangyal, Waqas Haider
    Alqarni, Mohammed A.
    Almazroi, Abdulwahab Ali
    JOURNAL OF HEALTHCARE ENGINEERING, 2022, 2022
  • [10] BERT-based Chinese text classification for emergency management with a novel loss function
    Zhongju Wang
    Long Wang
    Chao Huang
    Shutong Sun
    Xiong Luo
    Applied Intelligence, 2023, 53 : 10417 - 10428