ECOMETRICS IN THE AGE OF BIG DATA: MEASURING AND ASSESSING "BROKEN WINDOWS" USING LARGE-SCALE ADMINISTRATIVE RECORDS

Times cited: 92
Authors
O'Brien, Daniel Tumminelli [1, 2, 3]
Sampson, Robert J. [4 ]
Winship, Christopher [5 ]
Affiliations
[1] Northeastern Univ, Sch Publ Policy & Urban Affairs, Boston, MA 02120 USA
[2] Northeastern Univ, Sch Criminol & Criminal Justice, Boston, MA 02120 USA
[3] Harvard Univ, Radcliffe Inst Adv Study, Boston Area Res Initiat, Cambridge, MA 02138 USA
[4] Harvard Univ, Social Sci, Cambridge, MA 02138 USA
[5] Harvard Univ, John F Kennedy Sch Govt, Cambridge, MA 02138 USA
Source
Funding
U.S. National Science Foundation
Keywords
ecometrics; urban sociology; big data; computational social science; physical disorder; broken windows; 311; hotlines; NEIGHBORHOOD ENVIRONMENTS; DISORDER; SCIENCE; RISK;
DOI
10.1177/0081175015576601
Chinese Library Classification
O1 [Mathematics]; C [Social Sciences, General]
Discipline classification codes
03; 0303; 0701; 070101
Abstract
The collection of large-scale administrative records in electronic form by many cities provides a new opportunity for the measurement and longitudinal tracking of neighborhood characteristics, but one that will require novel methodologies that convert such data into research-relevant measures. The authors illustrate these challenges by developing measures of "broken windows" from Boston's constituent relationship management (CRM) system (aka 311 hotline). A 16-month archive of the CRM database contains more than 300,000 address-based requests for city services, many of which reference physical incivilities (e.g., graffiti removal). The authors carry out three ecometric analyses, each building on the previous one. Analysis 1 examines the content of the measure, identifying 28 items that constitute two independent constructs, private neglect and public denigration. Analysis 2 assesses the validity of the measure by using investigator-initiated neighborhood audits to examine the "civic response rate" across neighborhoods. Indicators of civic response were then extracted from the CRM database so that measurement adjustments could be automated. These adjustments were calibrated against measures of litter from the objective audits. Analysis 3 examines the reliability of the composite measure of physical disorder at different spatio-temporal windows, finding that census tracts can be measured at two-month intervals and census block groups at six-month intervals. The final measures are highly detailed, can be tracked longitudinally, and are virtually costless. This framework thus provides an example of how new forms of large-scale administrative data can yield ecometric measurement for urban science while illustrating the methodological challenges that must be addressed.
Pages: 101-147
Page count: 47