Sweet tweets! Evaluating a new approach for probability-based sampling of Twitter

Cited by: 3
Authors
Buskirk, Trent D. [1]
Blakely, Brian P. [1]
Eck, Adam [1]
McGrath, Richard [1]
Singh, Ravinder [1]
Yu, Youzhi [1]
Affiliation
[1] Bowling Green State Univ, Bowling Green, OH 43403 USA
Keywords
Twitter; Probability sampling; Tweets; Social media; COVID-19; Big data; Survey research;
DOI
10.1140/epjds/s13688-022-00321-1
Chinese Library Classification number
O1 [Mathematics];
Discipline classification code
0701 ; 070101 ;
Abstract
As survey costs continue to rise and response rates decline, researchers are seeking more cost-effective ways to collect, analyze and process social and public opinion data. These issues have created an opportunity and interest in expanding the fit-for-purpose paradigm to include alternate sources such as passively collected sensor data and social media data. However, methods for accessing, sourcing and sampling social media data are just now being developed. In fact, there has been a small but growing body of literature focusing on comparing different Twitter data access methods through either the elaborate firehose or the free Twitter search or streaming APIs. Missing from the literature is a good understanding of how to randomly sample Tweets to produce datasets that are representative of the daily discourse, especially within geographical regions of interest, without requiring a census of all Tweets. This understanding is necessary for producing quality estimates of public opinion from social media sources such as Twitter. To address this gap, we propose and test the Velocity-Based Estimation for Sampling Tweets (VBEST) algorithm for selecting a probability-based sample of Tweets. We compare the performance of VBEST sample estimates to other methods of accessing Twitter through the Search API on the distribution of total Tweets as well as COVID-19 keyword incidence and frequency, and find that the VBEST samples produce consistent and relatively low levels of overall bias compared to common methods of access through the Search API across many experimental conditions.
Pages: 32