Lexical Normalization of User-Generated Medical Forum Data

Cited by: 0
Authors
Dirkson, Anne [1]
Verberne, Suzan [1]
Kraaij, Wessel [1]
Affiliations
[1] Leiden Univ, LIACS, Niels Bohrweg 1, Leiden, Netherlands
Keywords
CORPUS
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In the medical domain, user-generated social media text is increasingly used as a valuable complementary knowledge source to scientific medical literature. The extraction of this knowledge is complicated by colloquial language use and misspellings. Yet, lexical normalization of such data has not been addressed properly. This paper presents an unsupervised, data-driven spelling correction module for medical social media. Our method outperforms state-of-the-art spelling correction and can detect mistakes with an F-0.5 of 0.888. Additionally, we present a novel corpus for spelling mistake detection and correction on a medical patient forum.
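
The F-0.5 figure quoted in the abstract is an F-beta score with beta = 0.5, which weights precision more heavily than recall. The sketch below shows how such a score is computed from detection counts. The counts are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Weighted harmonic mean of precision and recall.

    beta < 1 favours precision, beta > 1 favours recall, beta = 1 gives F1.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical mistake-detection counts (assumed, not from the paper):
# 90 true positives, 10 false positives, 30 false negatives.
tp, fp, fn = 90, 10, 30
precision = tp / (tp + fp)  # 0.9
recall = tp / (tp + fn)     # 0.75

score = f_beta(precision, recall)           # F-0.5
f1_score = f_beta(precision, recall, 1.0)   # F1, for comparison
```

Because precision exceeds recall in this example, the F-0.5 value comes out higher than F1. A precision-weighted metric suits spelling mistake detection when falsely flagging a correct word is considered more costly than missing an error.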
Pages: 11-20
Number of pages: 10