FF-BERT: A BERT-based ensemble for automated classification of web-based text on flash flood events

Cited by: 4
Authors
Wilkho, Rohan Singh [1]
Chang, Shi [2]
Gharaibeh, Nasir G. [1]
Affiliations
[1] Texas A&M Univ, Zachry Dept Civil & Environm Engn, College Stn, TX 77840 USA
[2] Trimble Inc, Westminster, CO 80021 USA
Keywords
Flash flood; Text classification; Multi-label text classification; BERT
DOI
10.1016/j.aei.2023.102293
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The web is a rich information repository that can be mined to uncover additional data about past flash flood (FF) events that are currently missing from existing structured databases. However, this information originates from multiple sources (news articles, government records, and weather records, among others) and may cover several topics. Furthermore, these topics may be disproportionately covered on the web. The large size and heterogeneous nature of web information render manual review difficult. To address this challenge, we have developed a multi-label text classification model, FF-BERT. FF-BERT is designed to classify FF-related web paragraphs into one or more of seven categories: (1) Damage and Economic Impact (DI), (2) Fatalities, Injuries, and Rescue (FIR), (3) Hydrometeorology (HM), (4) Warning and Emergency (WE), (5) Response and Recovery (RR), (6) Public Health (PH), and (7) Mitigation (MG). To develop FF-BERT, we labeled 21,180 paragraphs from FF-related webpages and performed experiments with multiple model architectures based on the widely used language model Bidirectional Encoder Representations from Transformers (BERT). Our final model outperforms the baseline by 11.83%, as measured by the micro-F1 score. In addition, FF-BERT significantly improves the prediction of minority labels (RR: 32.1%, PH: 260.4%, and MG: 138.6%). We demonstrate using real-world examples that FF-BERT can be used to uncover new information about flash flood events. This information can be used to enhance existing databases, such as NOAA's Storm Events Database.
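To illustrate the evaluation setting described in the abstract, the following is a minimal sketch of how a micro-F1 score can be computed for multi-label predictions over the seven categories. It is not the authors' implementation: the label codes come from the abstract, while the helper names (`to_indicator`, `micro_f1`) and the three example paragraphs with their gold and predicted label sets are hypothetical and made up for illustration.

```python
# Label codes are taken from the abstract; everything else is illustrative.
LABELS = ["DI", "FIR", "HM", "WE", "RR", "PH", "MG"]


def to_indicator(label_set):
    """Encode a set of label codes as a 0/1 vector in LABELS order."""
    return [1 if label in label_set else 0 for label in LABELS]


def micro_f1(true_rows, pred_rows):
    """Micro-averaged F1: pool TP, FP, FN over all labels and paragraphs."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(true_rows, pred_rows):
        for t, p in zip(true_row, pred_row):
            if t and p:
                tp += 1  # label correctly predicted
            elif p:
                fp += 1  # label predicted but not in the gold set
            elif t:
                fn += 1  # gold label that was missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical gold and predicted label sets for three paragraphs.
gold = [{"DI", "FIR"}, {"HM"}, {"RR", "MG"}]
pred = [{"DI"}, {"HM", "WE"}, {"RR", "MG"}]

y_true = [to_indicator(s) for s in gold]
y_pred = [to_indicator(s) for s in pred]

score = micro_f1(y_true, y_pred)
print(round(score, 3))  # 4 TP, 1 FP, 1 FN -> 0.8
```

Because micro-averaging pools counts across all labels, frequent categories dominate the score, which is presumably why the abstract reports gains on the minority labels (RR, PH, MG) separately.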
Pages: 12