Design and analysis of a large-scale COVID-19 tweets dataset

Cited by: 102
Authors
Lamsal, Rabindra [1]
Affiliations
[1] Jawaharlal Nehru Univ, Sch Comp & Syst Sci, New Delhi 110067, India
Keywords
Social computing; Crisis computing; Sentiment analysis; Network analysis; Twitter data; TWITTER; SENTIMENT; TIME;
DOI
10.1007/s10489-020-02029-z
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
As of July 17, 2020, more than thirteen million people have been diagnosed with the Novel Coronavirus (COVID-19), and half a million people have already lost their lives due to this infectious disease. The World Health Organization declared the COVID-19 outbreak as a pandemic on March 11, 2020. Since then, social media platforms have experienced an exponential rise in the content related to the pandemic. In the past, Twitter data have been observed to be indispensable in the extraction of situational awareness information relating to any crisis. This paper presents COV19Tweets Dataset (Lamsal 2020a), a large-scale Twitter dataset with more than 310 million COVID-19 specific English language tweets and their sentiment scores. The dataset's geo version, the GeoCOV19Tweets Dataset (Lamsal 2020b), is also presented. The paper discusses the datasets' design in detail, and the tweets in both the datasets are analyzed. The datasets are released publicly, anticipating that they would contribute to a better understanding of spatial and temporal dimensions of the public discourse related to the ongoing pandemic. As per the stats, the datasets (Lamsal 2020a, 2020b) have been accessed over 74.5k times, collectively.
Pages: 2790-2804
Page count: 15
Related papers
50 in total
  • [31] Risk Assessment of Large-Scale Sports Events in the Context of COVID-19
    Wang, Yiwei
    Xie, Ming
    Xie, Xiaowen
    Wang, Zhipeng
    Wang, Min
    Zhan, Xiuxiu
    Liu, Chuang
    Zhang, Zike
    Dianzi Keji Daxue Xuebao/Journal of the University of Electronic Science and Technology of China, 2022, 51 (06): : 937 - 946
  • [32] Impact of the COVID-19 pandemic on the Internet latency: A large-scale study
    Candela, Massimo
    Luconi, Valerio
    Vecchio, Alessio
    COMPUTER NETWORKS, 2020, 182
  • [33] Identifying Drug Candidates for COVID-19 with Large-Scale Drug Screening
    Wu, Yifei
    Pegan, Scott D. D.
    Crich, David
    Lou, Lei
    Mullininx, Lauren Nicole
    Starling, Edward B. B.
    Booth, Carson
    Chishom, Andrew Edward
    Chang, Kuan Y. Y.
    Xie, Zhong-Ru
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2023, 24 (05)
  • [34] Predicting COVID-19 Spread from Large-Scale Mobility Data
    Schwabe, Amray
    Persson, Joel
    Feuerriegel, Stefan
    KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2021, : 3531 - 3539
  • [36] Cost-effectiveness analysis on COVID-19 surveillance strategy of large-scale sports competition
    Wang, Xuechun
    Cai, Yiru
    Zhang, Bo
    Zhang, Xiangyu
    Wang, Lianhao
    Yan, Xiangyu
    Zhao, Mingchen
    Zhang, Yuan
    Jia, Zhongwei
    INFECTIOUS DISEASES OF POVERTY, 2022, 11 (01)
  • [37] Cost-effectiveness analysis on COVID-19 surveillance strategy of large-scale sports competition
    Xuechun Wang
    Yiru Cai
    Bo Zhang
    Xiangyu Zhang
    Lianhao Wang
    Xiangyu Yan
    Mingchen Zhao
    Yuan Zhang
    Zhongwei Jia
    Infectious Diseases of Poverty, 11
  • [38] Cost-effectiveness analysis on COVID-19 surveillance strategy of large-scale sports competition
    Wang Xuechun
    Cai Yiru
    Zhang Bo
    Zhang Xiangyu
    Wang Lianhao
    Yan Xiangyu
    Zhao Mingchen
    Zhang Yuan
    Jia Zhongwei
    Infectious Diseases of Poverty (English edition), 2022, 11 (02) : 53 - 62
  • [39] Sentimental and spatial analysis of COVID-19 vaccines tweets
    Areeba Umair
    Elio Masciari
    Journal of Intelligent Information Systems, 2023, 60 : 1 - 21
  • [40] Geospatial analysis of misinformation in COVID-19 related tweets
    Forati, Amir Masoud
    Ghose, Rina
    APPLIED GEOGRAPHY, 2021, 133