A method for automatic detection of acronyms in texts and building a dataset for acronym disambiguation

Cited by: 3
Authors
Azimi, Sasan [1]
Veisi, Hadi [1]
Amouie, Reyhaneh [1]
Affiliations
[1] Univ Tehran, Tehran, Iran
Keywords
Acronym disambiguation; Tech-mining; Text Mining; Natural Language Processing
DOI
10.1109/icspis48872.2019.9066084
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Acronyms are increasingly common in technical texts, and many of them are ambiguous: a single acronym can have several possible expansions, which makes recognizing the intended expansion a challenging task. Replacing an acronym with an incorrect expansion causes problems in text mining procedures, including text normalization, summarization, machine translation, and tech-mining. Tech-mining involves exploring and analyzing technical texts to recognize the relations between technologies. This paper presents the challenges in automatic acronym disambiguation and proposes a method for building a dataset that meets the requirements for training acronym disambiguation models on technical texts. The acronym disambiguation model achieves an accuracy of 86%.
Pages: 4