DETECTING SPATIAL PATTERNS OF NATURAL HAZARDS FROM THE WIKIPEDIA KNOWLEDGE BASE

Cited by: 1
Authors:
Fan, J. [1]
Stewart, K. [1]
Affiliations:
[1] Univ Iowa, Dept Geog & Sustainabil Sci, Iowa City, IA 52242 USA
Keywords:
Volunteered Geographic Information; User-Generated Knowledge; Topic Modeling; Big Geospatial Data; Wildfire
DOI:
10.5194/isprsannals-II-4-W2-87-2015
CLC number:
TP39 [Computer applications]
Discipline codes:
081203; 0835
Abstract:
The Wikipedia database is a data source of immense richness and variety. Included in this database are thousands of geo-tagged articles, including, for example, near real-time updates on current and historic natural hazards. These articles contain user-contributed information about the locations of natural hazards, the extent of the disasters, and many details relating to response, impact, and recovery. In this research, a computational framework is proposed to detect spatial patterns of natural hazards from the Wikipedia database by combining topic modeling methods with spatial analysis techniques. The computation is performed on the Neon Cluster, a high-performance computing cluster at the University of Iowa. This work uses wildfires as the exemplar hazard, but the framework generalizes readily to other types of hazards, such as hurricanes or flooding. Latent Dirichlet Allocation (LDA) modeling is first applied to the entire English Wikipedia dump, transforming the database dump into a 500-dimension topic model. Over 230,000 geo-tagged articles are then extracted from the Wikipedia database, spatially covering the contiguous United States. The geo-tagged articles are converted into the LDA topic space based on the topic model, with each article represented as a weighted multi-dimensional topic vector. By treating each article's topic vector as an observation at the article's geographic location, a probability surface is calculated for each of the topics. Wikipedia articles about wildfires are then extracted from the database, forming a wildfire corpus that serves as the basis for the topic vector analysis. The spatial distribution of wildfire outbreaks in the US is estimated by calculating the weighted sum of the topic probability surfaces using a map algebra approach, and mapped using GIS. To evaluate the approach, the estimation is compared to wildfire hazard potential maps created by the USDA Forest Service.
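The pipeline described in the abstract can be pictured with a short sketch. The following is a minimal, hypothetical Python illustration (not the authors' code), assuming gensim for the LDA topic model and scipy's weighted Gaussian kernel density estimation (scipy >= 1.2) for the per-topic probability surfaces. The toy corpus, random coordinates, 10-topic model, and grid resolution are placeholders standing in for the full English Wikipedia dump, the ~230,000 geo-tagged articles, and the 500-topic model used in the paper.

# Minimal sketch of the abstract's pipeline: LDA topic space -> per-topic
# probability surfaces -> weighted sum driven by a wildfire corpus.
# All data and parameter values below are placeholders, not from the paper.
import numpy as np
from scipy.stats import gaussian_kde          # weights= requires scipy >= 1.2
from gensim.corpora import Dictionary
from gensim.models import LdaModel

N_TOPICS = 10                                 # the paper uses 500 topics
rng = np.random.default_rng(42)

# 1. Train an LDA model (stand-in for training on the full Wikipedia dump).
docs = [["wildfire", "forest", "evacuation", "acres", "burned"],
        ["hurricane", "storm", "surge", "landfall", "damage"],
        ["river", "flood", "levee", "rainfall", "crest"],
        ["fire", "drought", "containment", "firefighters", "smoke"]] * 25
dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(d) for d in docs]
lda = LdaModel(corpus, id2word=dictionary, num_topics=N_TOPICS, passes=5)

def topic_vector(tokens):
    """Project one tokenized article into the LDA topic space."""
    bow = dictionary.doc2bow(tokens)
    vec = np.zeros(N_TOPICS)
    for k, p in lda.get_document_topics(bow, minimum_probability=0.0):
        vec[k] = p
    return vec

# 2. Geo-tagged articles: one (lon, lat) pair plus one topic vector each.
#    Coordinates are random placeholders over the contiguous US.
n_articles = len(docs)
coords = np.column_stack([rng.uniform(-125, -66, n_articles),
                          rng.uniform(24, 50, n_articles)])
topics = np.array([topic_vector(d) for d in docs])       # (n, N_TOPICS)

# 3. One probability surface per topic: kernel density of article locations,
#    weighted by each article's proportion of that topic.
lon_g, lat_g = np.meshgrid(np.linspace(-125, -66, 120), np.linspace(24, 50, 60))
grid = np.vstack([lon_g.ravel(), lat_g.ravel()])

def topic_surface(k):
    w = topics[:, k]
    if w.sum() <= 0:                          # guard against an empty topic
        return np.zeros(lon_g.shape)
    return gaussian_kde(coords.T, weights=w)(grid).reshape(lon_g.shape)

# 4. Map-algebra step: weighted sum of the topic surfaces, with weights taken
#    from the mean topic vector of a wildfire corpus (here, the wildfire docs).
wildfire_weights = topics[::4].mean(axis=0)   # stand-in for the wildfire corpus
hazard = sum(wildfire_weights[k] * topic_surface(k) for k in range(N_TOPICS))
print(hazard.shape)   # estimated wildfire-outbreak surface on the lon/lat grid

In the paper, the final weighted sum is computed with a map algebra approach in GIS and then mapped; the NumPy summation over raster grids above plays the same role in this sketch.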
Pages: 87-93 (7 pages)
相关论文
共 50 条
  • [1] Encyclopedic Knowledge Patterns from Wikipedia Links
    Nuzzolese, Andrea Giovanni
    Gangemi, Aldo
    Presutti, Valentina
    Ciancarini, Paolo
    SEMANTIC WEB - ISWC 2011, PT I, 2011, 7031 : 520 - 536
  • [2] Spatial patterns of natural hazards mortality in the United States
    Kevin A Borden
    Susan L Cutter
    International Journal of Health Geographics, 7
  • [3] Spatial patterns of natural hazards mortality in the United States
    Borden, Kevin A.
    Cutter, Susan L.
    INTERNATIONAL JOURNAL OF HEALTH GEOGRAPHICS, 2008, 7 (1)
  • [4] Construction of Encyclopedic Knowledge Base from Infobox of Indonesian Wikipedia
    Wahyudi
    Khodra, Masayu Leylia
    Wibisono, Yudi
    2018 INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY SYSTEMS AND INNOVATION (ICITSI), 2018, : 542 - 546
  • [5] YAGO: A Multilingual Knowledge Base from Wikipedia, Wordnet, and Geonames
    Rebele, Thomas
    Suchanek, Fabian
    Hoffart, Johannes
    Biega, Joanna
    Kuzey, Erdal
    Weikum, Gerhard
    SEMANTIC WEB - ISWC 2016, PT II, 2016, 9982 : 177 - 185
  • [6] Building Chinese field association knowledge base from Wikipedia
    Wang, Li
    Yao, Min
    Zhang, Yuanpeng
    Qian, Danmin
    Geng, Xinyun
    Jiang, Kui
    Dong, Jiancheng
    INTERNATIONAL JOURNAL OF COMPUTER APPLICATIONS IN TECHNOLOGY, 2015, 52 (2-3) : 168 - 176
  • [7] Populating ConceptNet knowledge base with Information Acquired from Japanese Wikipedia
    Krawczyk, Marek
    Rzepka, Rafal
    Araki, Kenji
    2015 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC 2015): BIG DATA ANALYTICS FOR HUMAN-CENTRIC SYSTEMS, 2015, : 2985 - 2989
  • [8] YAGO2: A spatially and temporally enhanced knowledge base from Wikipedia
    Hoffart, Johannes
    Suchanek, Fabian M.
    Berberich, Klaus
    Weikum, Gerhard
    ARTIFICIAL INTELLIGENCE, 2013, 194 : 28 - 61
  • [9] DBpedia - A large-scale, multilingual knowledge base extracted from Wikipedia
    Lehmann, Jens
    Isele, Robert
    Jakob, Max
    Jentzsch, Anja
    Kontokostas, Dimitris
    Mendes, Pablo N.
    Hellmann, Sebastian
    Morsey, Mohamed
    van Kleef, Patrick
    Auer, Soeren
    Bizer, Christian
    SEMANTIC WEB, 2015, 6 (02) : 167 - 195
  • [10] Constructing Semantic Knowledge Base based on Wikipedia automation
    Niu, Wanpeng
    Chen, Junting
    Chen, Meilin
    PROCEEDINGS OF THE 2016 2ND INTERNATIONAL CONFERENCE ON MATERIALS ENGINEERING AND INFORMATION TECHNOLOGY APPLICATIONS (MEITA 2016), 2017, 107 : 202 - 209