Crowd-sourced Text Analysis: Reproducible and Agile Production of Political Data

Cited by: 126
Authors
Benoit, Kenneth [1, 2]
Conway, Drew [3]
Lauderdale, Benjamin E. [4]
Laver, Michael [3]
Mikhaylov, Slava [5]
Affiliations
[1] London Sch Econ, London, England
[2] Trinity Coll Dublin, Dublin, Ireland
[3] NYU, New York, NY 10003 USA
[4] London Sch Econ & Polit Sci, London, England
[5] UCL, London WC1E 6BT, England
Funding
European Research Council
Keywords
PARTY; RELIABILITY;
DOI
10.1017/S0003055416000058
Chinese Library Classification number
D0 [Political Science, Political Theory]
Discipline classification code
0302; 030201
Abstract
Empirical social science often relies on data that are not observed in the field, but are transformed into quantitative variables by expert researchers who analyze and interpret qualitative raw sources. While generally considered the most valid way to produce data, this expert-driven process is inherently difficult to replicate or to assess on grounds of reliability. Using crowd-sourcing to distribute text for reading and interpretation by massive numbers of nonexperts, we generate results comparable to those using experts to read and interpret the same texts, but do so far more quickly and flexibly. Crucially, the data we collect can be reproduced and extended transparently, making crowd-sourced datasets intrinsically reproducible. This focuses researchers' attention on the fundamental scientific objective of specifying reliable and replicable methods for collecting the data needed, rather than on the content of any particular dataset. We also show that our approach works straightforwardly with different types of political text, written in different languages. While findings reported here concern text analysis, they have far-reaching implications for expert-generated data in the social sciences.
Pages: 278-295
Page count: 18
Related papers
50 items in total
  • [31] Vayu: An Open-Source Toolbox for Visualization and Analysis of Crowd-Sourced Sensor Data
    Mahajan, Sachit
    SENSORS, 2021, 21 (22)
  • [32] Representativeness and Diversity in Photos via Crowd-Sourced Media Analysis
    Radu, Anca-Livia
    Stoettinger, Julian
    Ionescu, Bogdan
    Menendez, Maria
    Giunchiglia, Fausto
    ADAPTIVE MULTIMEDIA RETRIEVAL: SEMANTICS, CONTEXT, AND ADAPTATION, AMR 2012, 2014, 8382 : 116 - 129
  • [33] LSTrAP-Crowd: prediction of novel components of bacterial ribosomes with crowd-sourced analysis of RNA sequencing data
    Hew, Benedict
    Tan, Qiao Wen
    Goh, William
    Ng, Jonathan Wei Xiong
    Mutwil, Marek
    BMC BIOLOGY, 2020, 18 (01)
  • [34] LSTrAP-Crowd: prediction of novel components of bacterial ribosomes with crowd-sourced analysis of RNA sequencing data
    Benedict Hew
    Qiao Wen Tan
    William Goh
    Jonathan Wei Xiong Ng
    Marek Mutwil
    BMC Biology, 18
  • [35] BioGames: A Platform for Crowd-Sourced Biomedical Image Analysis and Telediagnosis
    Mavandadi, Sam
    Feng, Steve
    Yu, Frank
    Dimitrov, Stoyan
    Yu, Richard
    Ozcan, Aydogan
    GAMES FOR HEALTH JOURNAL, 2012, 1 (05) : 373 - 376
  • [36] Autonomous convergence mechanisms for collaborative crowd-sourced data-modeling
    Luebben, Christian
    Pahl, Marc-Oliver
    PROCEEDINGS OF THE IEEE/IFIP NETWORK OPERATIONS AND MANAGEMENT SYMPOSIUM 2022, 2022,
  • [37] Crowd-sourced trait data can be used to delimit global biomes
    Scheiter, Simon
    Wolf, Sophie
    Kattenborn, Teja
    BIOGEOSCIENCES, 2024, 21 (21) : 4909 - 4926
  • [38] Using Qualitative Spatial Logic for Validating Crowd-Sourced Geospatial Data
    Du, Heshan
    Hai Nguyen
    Alechina, Natasha
    Logan, Brian
    Jackson, Michael
    Goodwin, John
    PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2015, : 3948 - 3953
  • [39] A Map Framework Using Crowd-Sourced Data for Indoor Positioning and Navigation
    Graichen, Thomas
    Gruschka, Erik
    Heinkel, Ulrich
    2017 IEEE INTERNATIONAL WORKSHOP ON MEASUREMENT AND NETWORKING (M&N), 2017, : 217 - 222
  • [40] Crowd-Sourced Data Collection for Urban Monitoring via Mobile Sensors
    Longo, Antonella
    Zappatore, Marco
    Bochicchio, Mario
    Navathe, Shamkant B.
    ACM TRANSACTIONS ON INTERNET TECHNOLOGY, 2017, 18 (01)