Sweet tweets! Evaluating a new approach for probability-based sampling of Twitter

Cited by: 3
Authors
Buskirk, Trent D. [1 ]
Blakely, Brian P. [1 ]
Eck, Adam [1 ]
McGrath, Richard [1 ]
Singh, Ravinder [1 ]
Yu, Youzhi [1 ]
Affiliation
[1] Bowling Green State Univ, Bowling Green, OH 43403 USA
Keywords
Twitter; Probability sampling; Tweets; Social media; COVID-19; Big data; Survey research
DOI
10.1140/epjds/s13688-022-00321-1
Chinese Library Classification
O1 [Mathematics]
Subject classification code
0701; 070101
Abstract
As survey costs continue to rise and response rates decline, researchers are seeking more cost-effective ways to collect, analyze and process social and public opinion data. These issues have created both an opportunity for and an interest in expanding the fit-for-purpose paradigm to include alternate sources such as passively collected sensor data and social media data. However, methods for accessing, sourcing and sampling social media data are just now being developed. A small but growing body of literature compares different Twitter data access methods, through either the elaborate firehose or the free Twitter Search or Streaming APIs. Missing from the literature is a good understanding of how to randomly sample Tweets to produce datasets that are representative of the daily discourse, especially within geographical regions of interest, without requiring a census of all Tweets. This understanding is necessary for producing quality estimates of public opinion from social media sources such as Twitter. To address this gap, we propose and test the Velocity-Based Estimation for Sampling Tweets (VBEST) algorithm for selecting a probability-based sample of Tweets. We compare VBEST sample estimates with estimates from other methods of accessing Twitter through the Search API, looking at the distribution of total Tweets as well as COVID-19 keyword incidence and frequency. We find that the VBEST samples produce consistent and relatively low levels of overall bias compared to common methods of access through the Search API across many experimental conditions.
Pages: 32