A lightweight and multilingual framework for crisis information extraction from Twitter data

Cited by: 13
Authors
Interdonato, Roberto [1]
Guillaume, Jean-Loup [2]
Doucet, Antoine [2]
Affiliations
[1] Univ Montpellier, CIRAD, TETIS, APT, CNRS, Irstea, Montpellier, France
[2] Univ La Rochelle, L3I, La Rochelle, France
Keywords
Crisis management; Situational awareness; Informativeness ranking; QUALITY
DOI
10.1007/s13278-019-0608-4
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Obtaining relevant, timely information during crisis events is a challenging task that can be fundamental to handling the consequences of both unexpected events (e.g., terrorist attacks) and partially predictable ones (e.g., natural disasters). Even though microblogging-based online social networks (e.g., Twitter) have become an attractive data source in these emergency situations, overcoming the information overload deriving from mass events is not trivial. The aim of this work was to enable unsupervised extraction of relevant information from Twitter data during a crisis event, offering a lightweight alternative to learning-based approaches. The proposed lightweight crisis management framework integrates natural language processing and clustering techniques to produce a ranking of tweets relevant to a crisis situation based on their informativeness. Experiments carried out on six Twitter collections in two languages (English and French) proved the significance and the flexibility of our approach.
Pages: 20
Related papers
50 in total
  • [21] Multilingual Open Information Extraction: Challenges and Opportunities
    Claro, Daniela Barreiro
    Souza, Marlo
    Xavier, Clarissa Castella
    Oliveira, Leandro
    [J]. INFORMATION, 2019, 10 (07)
  • [22] Multilingual open information extraction: Challenges and opportunities
    Claro, Daniela Barreiro
    Souza, Marlo
    Xavier, Clarissa Castellã
    Oliveira, Leandro
    [J]. Information (Switzerland), 2019, 10 (07)
  • [23] A framework for multilingual electronic data interchange
    Maani, R
    Parsa, S
    [J]. E-COMMERCE AND WEB TECHNOLOGIES, 2004, 3182 : 196 - 205
  • [24] A Multilingual Information Extraction Pipeline for Investigative Journalism
    Wiedemann, Gregor
    Yimam, Seid Muhie
    Biemann, Chris
    [J]. CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018): PROCEEDINGS OF SYSTEM DEMONSTRATIONS, 2018 : 78 - 83
  • [25] Raimond: Quantitative Data Extraction from Twitter to Describe Events
    Sellam, Thibault
    Alonso, Omar
    [J]. ENGINEERING THE WEB IN THE BIG DATA ERA, 2015, 9114 : 251 - 268
  • [26] Optimal Path Finding based on Traffic Information Extraction from Twitter
    Hasby, Muhammad
    Khodra, Masayu Leylia
    [J]. 2013 INTERNATIONAL CONFERENCE ON ICT FOR SMART SOCIETY (ICISS): THINK ECOSYSTEM ACT CONVERGENCE, 2013 : 120 - 124
  • [27] LinguaKit: a Big Data-based multilingual tool for linguistic analysis and information extraction
    Gamallo, Pablo
    Garcia, Marcos
    Pineiro, Cesar
    Martinez-Castano, Rodrigo
    Pichel, Juan C.
    [J]. 2018 FIFTH INTERNATIONAL CONFERENCE ON SOCIAL NETWORKS ANALYSIS, MANAGEMENT AND SECURITY (SNAMS), 2018 : 239 - 244
  • [28] Multilingual information retrieval in the language modeling framework
    Rahimi, Razieh
    Shakery, Azadeh
    King, Irwin
    [J]. INFORMATION RETRIEVAL JOURNAL, 2015, 18 (03) : 246 - 281
  • [29] Multilingual information retrieval in the language modeling framework
    Razieh Rahimi
    Azadeh Shakery
    Irwin King
    [J]. Information Retrieval Journal, 2015, 18 : 246 - 281
  • [30] A framework with efficient extraction and analysis of Twitter data for evaluating public opinions on transportation services
    Qi, Bing
    Costin, Aaron
    Jia, Mengda
    [J]. TRAVEL BEHAVIOUR AND SOCIETY, 2020, 21 : 10 - 23