Mining Spatio-temporal Data on Industrialization from Historical Registries

Cited by: 11
Authors
Berenbaum, D. [1]
Deighan, D. [1]
Marlow, T. [2]
Lee, A. [1]
Frickel, S. [2]
Howison, M. [1]
Affiliations
[1] Brown Univ, Comp & Informat Serv, Data Sci Practice, 3 Davol Sq, Providence, RI 02912 USA
[2] Brown Univ, Inst Brown Environm & Soc, 80 Waterman St, Providence, RI 02912 USA
Keywords
structured text; historical data; geocoding; page layout analysis; socio-environmental analysis; LAND-USE CONVERSIONS; WASTE; URBANIZATION; CLUSTERS;
DOI
10.3808/jei.201700381
Chinese Library Classification (CLC) number
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
Despite the growing availability of big data in many fields, historical data on socio-environmental phenomena are often not available due to a lack of automated and scalable approaches for collecting, digitizing, and assembling them. We have developed a data-mining method for extracting tabulated, geocoded data from printed directories. While scanning and optical character recognition (OCR) can digitize printed text, these methods alone do not capture the structure of the underlying data. Our pipeline integrates both page layout analysis and OCR to extract tabular, geocoded data from structured text. We demonstrate the utility of this method by applying it to scanned manufacturing registries from Rhode Island that record 41 years of industrial land use. The resulting spatio-temporal data can be used for socio-environmental analyses of industrialization at a resolution that was not previously possible. In particular, we find strong evidence for the dispersion of manufacturing from the urban core of Providence, the state's capital, along the Interstate 95 corridor to the north and south.
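The idea of combining page layout with OCR output can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes OCR has already produced word boxes with pixel coordinates, and the `Word` class, the helper functions, the column boundaries, and the sample registry entries are all hypothetical. Geocoding of the extracted address column is left out.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # left edge of the OCR bounding box, in pixels (assumed input)
    y: float  # vertical center of the bounding box, in pixels (assumed input)


def group_rows(words, y_tol=5.0):
    """Group words into rows: a word joins the current row if its vertical
    center is within y_tol of the first word placed in that row."""
    rows = []
    for w in sorted(words, key=lambda w: (w.y, w.x)):
        if rows and abs(w.y - rows[-1][0].y) <= y_tol:
            rows[-1].append(w)
        else:
            rows.append([w])
    return rows


def split_columns(row, boundaries):
    """Assign each word to a column by counting how many of the sorted
    x boundaries lie at or to the left of the word."""
    cells = [[] for _ in range(len(boundaries) + 1)]
    for w in sorted(row, key=lambda w: w.x):
        col = sum(w.x >= b for b in boundaries)
        cells[col].append(w.text)
    return [" ".join(c) for c in cells]


def extract_table(words, boundaries, y_tol=5.0):
    """Turn unordered OCR words into a list of rows of cell strings."""
    return [split_columns(r, boundaries) for r in group_rows(words, y_tol)]


# Hypothetical OCR output for two registry lines (name, address, city).
sample_words = [
    Word("Acme", 10, 100), Word("Tool", 60, 101),
    Word("12", 200, 99), Word("Main", 230, 100), Word("St", 280, 100),
    Word("Providence", 400, 102),
    Word("Bay", 10, 120), Word("Mills", 50, 121),
    Word("4", 200, 120), Word("Elm", 220, 119), Word("St", 260, 120),
    Word("Warwick", 400, 120),
]

# Column boundaries (x = 180 and x = 380) are assumed known from layout analysis.
table = extract_table(sample_words, boundaries=[180, 380])
for row in table:
    print(row)
```

In this sketch the column boundaries are supplied by hand; in practice they would have to be estimated from the scanned page, which is the part that plain OCR does not provide.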
Pages: 28-34
Number of pages: 7
Related papers
  • [1] Mining spatio-temporal data
    Gennady Andrienko
    Donato Malerba
    Michael May
    Maguelonne Teisseire
    [J]. Journal of Intelligent Information Systems, 2006, 27 : 187 - 190
  • [2] Mining spatio-temporal data
    Andrienko, Gennady
    Malerba, Donato
    May, Michael
    Teisseire, Maguelonne
    [J]. JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2006, 27 (03) : 187 - 190
  • [3] A survey on spatio-temporal data mining
    Vasavi, M.
    Murugan, A.
    [J]. Materials Today: Proceedings, 2023, 80 : 2769 - 2772
  • [4] Fuzzy association rule mining from spatio-temporal data
    Calargun, Seda Unal
    Yazici, Adnan
    [J]. COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2008, PT 1, PROCEEDINGS, 2008, 5072 : 631 - 646
  • [5] Mining Location Information from Users' Spatio-temporal Data
    Jenson, Sage
    Reeves, Majerle
    Tomasini, Marcello
    Menezes, Ronaldo
    [J]. 2017 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTED, SCALABLE COMPUTING & COMMUNICATIONS, CLOUD & BIG DATA COMPUTING, INTERNET OF PEOPLE AND SMART CITY INNOVATION (SMARTWORLD/SCALCOM/UIC/ATC/CBDCOM/IOP/SCI), 2017,
  • [6] Exploratory spatio-temporal data mining and visualization
    Compieta, P.
    Di Martino, S.
    Bertolotto, M.
    Ferrucci, F.
    Kechadi, T.
    [J]. JOURNAL OF VISUAL LANGUAGES AND COMPUTING, 2007, 18 (03) : 255 - 279
  • [7] A new approach for spatio-temporal data mining
    Cassat, Sabine
    Irani, Pourang
    Serrano, Marcos
    Dubois, Emmanuel
    [J]. ACTES DE LA 30 CONFERENCE FRANCOPHONE SUR L'INTERACTION HOMME-MACHINE - (IHM 2018), 2018, : 163 - 169
  • [8] A visual approach for spatio-temporal data mining
    Kechadi, M-Tahar
    Bertolotto, Michela
    [J]. IRI 2006: PROCEEDINGS OF THE 2006 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION, 2006, : 504 - +
  • [9] Mining Spatio-Temporal Patterns in Trajectory Data
    Kang, Juyoung
    Yong, Hwan-Seung
    [J]. JOURNAL OF INFORMATION PROCESSING SYSTEMS, 2010, 6 (04) : 521 - 536
  • [10] Indexing Historical Spatio-Temporal Data in the Cloud
    Zhang, Chong
    Chen, Xiaoying
    Ge, Bin
    Xiao, Weidong
    [J]. PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIG DATA, 2015, : 1765 - 1774