Mining Twitter for Adverse Drug Reaction Mentions: A Corpus and Classification Benchmark

Authors
Ginn, Rachel [1]
Pimpalkhute, Pranoti [1]
Nikfarjam, Azadeh [1]
Patki, Apurv [1]
O'Connor, Karen [1]
Sarker, Abeed [1]
Smith, Karen [2]
Gonzalez, Graciela [1]
Affiliations
[1] Arizona State Univ, Tempe, AZ 85281 USA
[2] Regis Univ, Denver, CO USA
Keywords
adverse drug reactions; twitter; social media; mining; machine learning; biomedicine; pharmacovigilance; classification; natural language processing; AGREEMENT;
DOI
Not available
Chinese Library Classification (CLC) number
H0 [Linguistics]
Discipline classification codes
030303; 0501; 050102
Abstract
With many adults using social media to discuss health information, researchers have begun to use this resource to monitor or detect health conditions at the population level. Twitter in particular has grown to several hundred million users and could be a rich information source for detecting serious medical conditions such as adverse drug reactions (ADRs). However, Twitter also presents unique challenges because of its brevity, lack of structure, and informal language. We present a freely available, manually annotated corpus of 10,822 tweets that can be used to train automated tools to mine Twitter for ADRs. We collected tweets using drug names as keywords, expanding the keyword list with an algorithm that generates misspelled versions of the drug names for maximum coverage. We annotated each tweet for the presence of an ADR mention and, for tweets that contained one, annotated the mention itself (including its span and the UMLS IDs of the ADRs). Inter-annotator agreement for the binary classification had a Kappa value of 0.69, which may be considered substantial (Viera & Garrett, 2005). We evaluated the utility of the corpus by training two classes of machine learning algorithms: Naive Bayes and Support Vector Machines. The results validate the usefulness of the corpus for automated mining tasks. The classification corpus is available from http://diego.asu.edu/downloads.
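The agreement figure reported in the abstract can be made concrete with a small worked example. The sketch below assumes the statistic is Cohen's kappa for two annotators, computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance; the abstract does not say which kappa variant was used. The function name and the toy labels are hypothetical and are not taken from the corpus.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each label, multiply the two annotators'
    # marginal rates of using that label, then sum over labels.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)

    if p_e == 1.0:
        # Both annotators used one identical label throughout; kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy annotations (1 = tweet mentions an ADR, 0 = it does not).
annotator_1 = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
annotator_2 = [1, 0, 0, 0, 1, 0, 0, 1, 1, 0]

kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

In this toy case the annotators agree on 8 of 10 tweets (p_o = 0.8) while chance agreement is 0.52, so kappa falls well below the raw agreement rate. This is why a kappa value, rather than percent agreement, is the figure usually reported for annotation tasks with skewed class distributions.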
Pages: 8
Related papers
  • [1] Twitter Opinion Mining for Adverse Drug Reactions
    Wu, Liang
    Moh, Teng-Sheng
    Khuri, Natalia
    [J]. PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIG DATA, 2015, : 1570 - 1574
  • [2] Detection of Adverse Drug Reaction mentions in tweets using ELMo
    Sarabadani, Sarah
    [J]. SOCIAL MEDIA MINING FOR HEALTH APPLICATIONS (#SMM4H) WORKSHOP & SHARED TASK, 2019, : 120 - 122
  • [3] DeepADEMiner: a deep learning pharmacovigilance pipeline for extraction and normalization of adverse drug event mentions on Twitter
    Magge, Arjun
    Tutubalina, Elena
    Miftahutdinov, Zulfat
    Alimova, Ilseyar
    Dirkson, Anne
    Verberne, Suzan
    Weissenbacher, Davy
    Gonzalez-Hernandez, Graciela
    [J]. JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2021, 28 (10) : 2184 - 2192
  • [4] Deep Learning for Identification of Adverse Effect Mentions in Twitter Data
    Barry, Paul
    Uzuner, Ozlem
    [J]. SOCIAL MEDIA MINING FOR HEALTH APPLICATIONS (#SMM4H) WORKSHOP & SHARED TASK, 2019, : 99 - 101
  • [5] Co-training for Extraction of Adverse Drug Reaction Mentions from Tweets
    Gupta, Shashank
    Gupta, Manish
    Varma, Vasudeva
    Pawar, Sachin
    Ramrakhiyani, Nitin
    Palshikar, Girish Keshav
    [J]. ADVANCES IN INFORMATION RETRIEVAL (ECIR 2018), 2018, 10772 : 556 - 562
  • [6] DeepCADRME: A deep neural model for complex adverse drug reaction mentions extraction
    El-allaly, Ed-drissiya
    Sarrouti, Mourad
    En-Nahnahi, Noureddine
    El Alaoui, Said Ouatik
    [J]. PATTERN RECOGNITION LETTERS, 2021, 143 : 27 - 35
  • [7] Pharmacovigilance from social media: mining adverse drug reaction mentions using sequence labeling with word embedding cluster features
    Nikfarjam, Azadeh
    Sarker, Abeed
    O'Connor, Karen
    Ginn, Rachel
    Gonzalez, Graciela
    [J]. JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2015, 22 (03) : 671 - 681
  • [8] Multi-task Learning for Extraction of Adverse Drug Reaction Mentions from Tweets
    Gupta, Shashank
    Gupta, Manish
    Varma, Vasudeva
    Pawar, Sachin
    Ramrakhiyani, Nitin
    Palshikar, Girish Keshav
    [J]. ADVANCES IN INFORMATION RETRIEVAL (ECIR 2018), 2018, 10772 : 59 - 71
  • [9] Portable automatic text classification for adverse drug reaction detection via multi-corpus training
    Sarker, Abeed
    Gonzalez, Graciela
    [J]. JOURNAL OF BIOMEDICAL INFORMATICS, 2015, 53 : 196 - 207
  • [10] Exploring Brand-Name Drug Mentions on Twitter for Pharmacovigilance
    Carbonell, Pablo
    Mayer, Miguel A.
    Bravo, Alex
    [J]. DIGITAL HEALTHCARE EMPOWERING EUROPEANS, 2015, 210 : 55 - 59