A Multiple Feature Category Data Mining and Machine Learning Approach to Characterize and Detect Health Misinformation on Social Media

Cited by: 6
Authors
Safarnejad, Lida [1]
Xu, Qian [2]
Ge, Yaorong [3]
Chen, Shi [3]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] Elon Univ, Elon, NC 27244 USA
[3] Univ N Carolina, Charlotte, NC 28223 USA
Keywords
Feature extraction; Social networking (online); Data mining; Measurement; Internet; Blogs; Support vector machines
DOI
10.1109/MIC.2021.3063257
Chinese Library Classification number
TP31 [Computer software]
Subject classification code
081202; 0835
Abstract
In this article, we characterize health misinformation infiltration as a dynamic dissemination process on social media, in addition to using content-based features. Using Zika discussion on Twitter in 2016 as the study system, we identified the 264 most influential tweets with misinformation and 455 matched tweets with real information. We developed an algorithm to infer the information dissemination network through retweeting for each tweet and extracted nine network metrics. We then approximated information dissemination as a nonhomogeneous Poisson process (NHPP) signal and extracted 40 signal features to characterize each NHPP. For content-based features, we applied both Linguistic Inquiry and Word Count and document-to-vector to extract 63 and 50 features for each tweet, respectively. Finally, we also considered four user features. Based on these extracted feature categories, we trained support vector machine and random forest (RF) classifiers. Using all feature categories combined as input, an RF classifier achieved > 83% accuracy and > 90% AUC in detecting misinformation.
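The step of treating retweet activity as a time-varying count signal can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not name the 40 signal features, so the function names (`count_signal`, `signal_features`), the one-hour bin width, the sample timestamps, and the five summary features below are all hypothetical choices made for illustration only.

```python
from statistics import mean, pstdev


def count_signal(retweet_times, bin_width=1.0, horizon=None):
    """Bin event times (hours since the original tweet) into counts per bin.

    The per-bin counts act as a crude empirical estimate of a
    time-varying event rate. The bin width is an assumption.
    """
    if not retweet_times:
        return []
    if horizon is None:
        horizon = max(retweet_times)
    n_bins = int(horizon // bin_width) + 1
    counts = [0] * n_bins
    for t in retweet_times:
        idx = int(t // bin_width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts


def signal_features(counts):
    """Compute a few hypothetical summary features of the count signal."""
    peak = max(counts)
    return {
        "total_events": sum(counts),
        "peak_count": peak,
        "peak_bin": counts.index(peak),  # first bin reaching the peak
        "mean_rate": mean(counts),       # average events per bin
        "std_rate": pstdev(counts),      # population standard deviation
    }


# Made-up retweet times, in hours after the original tweet.
times = [0.2, 0.5, 0.9, 1.1, 1.3, 2.7, 5.0]
counts = count_signal(times)
features = signal_features(counts)
print(counts)
print(features)
```

In a pipeline like the one the abstract describes, a feature vector of this kind would be concatenated with the network, content, and user features before training a classifier; that step is omitted here.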
Pages: 43-51
Number of pages: 9