DETECTING SPATIAL PATTERNS OF NATURAL HAZARDS FROM THE WIKIPEDIA KNOWLEDGE BASE

Cited by: 1
Authors
Fan, J. [1]
Stewart, K. [1]
Affiliations
[1] Univ Iowa, Dept Geog & Sustainabil Sci, Iowa City, IA 52242 USA
Keywords
Volunteered Geographic Information; User-Generated Knowledge; Topic Modeling; Big Geospatial Data; Wildfire
DOI
10.5194/isprsannals-II-4-W2-87-2015
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The Wikipedia database is a rich and varied data source. It contains thousands of geo-tagged articles, including near-real-time updates on current and historic natural hazards. These articles carry user-contributed information about the location of natural hazards, the extent of the disasters, and many details relating to response, impact, and recovery. In this research, a computational framework is proposed to detect spatial patterns of natural hazards from the Wikipedia database by combining topic modeling methods with spatial analysis techniques. The computation is performed on the Neon Cluster, a high-performance computing cluster at the University of Iowa. This work uses wildfires as the exemplar hazard, but the framework generalizes readily to other types of hazards, such as hurricanes or flooding. A Latent Dirichlet Allocation (LDA) model is first trained on the entire English Wikipedia dump, transforming the database dump into a 500-dimensional topic model. Over 230,000 geo-tagged articles covering the contiguous United States are then extracted from the Wikipedia database. The geo-tagged articles are converted into the LDA topic space based on the topic model, with each article represented as a weighted multi-dimensional topic vector. By treating each article's topic vector as an observed point in geographic space, a probability surface is calculated for each of the topics. Wikipedia articles about wildfires are also extracted from the Wikipedia database, forming a wildfire corpus that provides the basis for the topic vector analysis. The spatial distribution of wildfire outbreaks in the US is estimated by calculating the weighted sum of the topic probability surfaces using a map algebra approach, and mapped using GIS. To evaluate the approach, the estimate is compared to wildfire hazard potential maps created by the USDA Forest Service.
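The last two steps of the abstract (a surface per topic, then a weighted sum of surfaces) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the probability surfaces are computed, so a Gaussian kernel with a fixed bandwidth is assumed here, and the function names `topic_surface` and `combine_surfaces`, the toy coordinates, the two-topic vectors, and the wildfire weights are all hypothetical.

```python
import numpy as np

def topic_surface(coords, topic_weights, grid_x, grid_y, bandwidth=1.0):
    """Kernel-weighted surface for one topic, normalized to sum to 1.

    ASSUMPTION: a Gaussian kernel is used; the abstract does not name
    the interpolation method behind its probability surfaces.
    """
    gx, gy = np.meshgrid(grid_x, grid_y)          # each has shape (ny, nx)
    surface = np.zeros(gx.shape, dtype=float)
    for (x, y), w in zip(coords, topic_weights):
        d2 = (gx - x) ** 2 + (gy - y) ** 2        # squared distance to the article
        surface += w * np.exp(-d2 / (2.0 * bandwidth ** 2))
    total = surface.sum()
    return surface / total if total > 0 else surface

def combine_surfaces(surfaces, corpus_weights):
    """Map-algebra style overlay: cell-by-cell sum of w_k * surface_k."""
    surfaces = np.asarray(surfaces, dtype=float)  # shape (n_topics, ny, nx)
    corpus_weights = np.asarray(corpus_weights, dtype=float)
    return np.tensordot(corpus_weights, surfaces, axes=1)

# Toy data (hypothetical): three geo-tagged articles, two topics.
coords = np.array([[1.0, 1.0], [4.0, 4.0], [4.0, 1.0]])
doc_topics = np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]])
wildfire_weights = np.array([0.7, 0.3])           # topic weights of the hazard corpus

grid_x = np.linspace(0.0, 5.0, 6)
grid_y = np.linspace(0.0, 5.0, 6)
surfaces = [topic_surface(coords, doc_topics[:, k], grid_x, grid_y)
            for k in range(doc_topics.shape[1])]
estimate = combine_surfaces(surfaces, wildfire_weights)
```

In this toy setup the combined surface is highest near the article whose dominant topic carries the largest wildfire weight. A real application would use 500 topics and projected coordinates rather than a 6 by 6 grid.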
Pages: 87-93
Page count: 7