Word Embedding-Based Biomedical Text Summarization

Cited by: 2
Authors
Rouane, Oussama [1]
Belhadef, Hacene [1]
Bouakkaz, Mustapha [2]
Affiliations
[1] Univ Constantine 2 Abdelhamid Mehri, Constantine, Algeria
[2] Univ Amar Telidgi, Comp Sci Dept, Fac Sci, Laghouat, Algeria
Keywords
Biomedical text summarization; Word embedding; Word2vec; PageRank algorithm; ROUGE metrics;
DOI
10.1007/978-3-030-33582-3_28
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a novel word embedding-based biomedical text summarizer. Biomedical words are represented by dense real-valued vectors. Each sentence is represented by the sum of the vectors of the words it contains. The PageRank algorithm is applied to rank sentences, using cosine similarity as the similarity measure between sentence vectors. The top N ranked sentences are selected to build the summary. For the evaluation, we created a corpus of 200 biomedical papers downloaded from the BioMed Central full-text database. We used a pre-trained Word2vec model whose word vectors were generated from a combination of PubMed, PMC, and a recent English Wikipedia dump. We compared our method with four other summarizers using the ROUGE-1, ROUGE-2, ROUGE-3, and ROUGE-SU4 metrics, evaluating the generated summaries against the abstracts of the papers. Our summarizer achieved improvements of 3.48%, 7.68%, 9.76%, and 3.47%, respectively, over the second-ranked summarizer.
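The pipeline described in the abstract (summed word vectors, cosine similarity, PageRank, top N selection) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy three-dimensional vectors in `TOY_VECTORS` are hypothetical stand-ins for the pre-trained Word2vec model, whitespace tokenization, clipping negative similarities to zero, the damping factor of 0.85, and returning the selected sentences in document order are all assumptions the abstract does not specify.

```python
import numpy as np

# Hypothetical toy embeddings; a real run would load pre-trained Word2vec vectors.
TOY_VECTORS = {
    "protein": np.array([0.9, 0.1, 0.0]),
    "gene": np.array([0.8, 0.2, 0.1]),
    "expression": np.array([0.7, 0.3, 0.0]),
    "cell": np.array([0.6, 0.4, 0.1]),
    "weather": np.array([0.0, 0.1, 0.9]),
    "sunny": np.array([0.1, 0.0, 0.8]),
}


def sentence_vector(sentence, vectors, dim):
    """Sum the vectors of the known words in a sentence (assumed whitespace tokens)."""
    vec = np.zeros(dim)
    for token in sentence.lower().split():
        if token in vectors:
            vec += vectors[token]
    return vec


def cosine_similarity_matrix(mat):
    """Pairwise cosine similarity between rows, with a zeroed diagonal."""
    norms = np.linalg.norm(mat, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty sentences
    unit = mat / norms
    sim = unit @ unit.T
    np.fill_diagonal(sim, 0.0)
    # Assumption: negative similarities are clipped so edge weights stay non-negative.
    return np.clip(sim, 0.0, None)


def pagerank(weights, damping=0.85, tol=1e-8, max_iter=200):
    """Power-iteration PageRank on a weighted graph given as a square matrix."""
    n = weights.shape[0]
    row_sums = weights.sum(axis=1, keepdims=True)
    safe_sums = np.where(row_sums > 0, row_sums, 1.0)
    # Rows without outgoing weight are treated as linking uniformly to all nodes.
    transition = np.where(row_sums > 0, weights / safe_sums, 1.0 / n)
    scores = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        updated = (1 - damping) / n + damping * (transition.T @ scores)
        if np.abs(updated - scores).sum() < tol:
            return updated
        scores = updated
    return scores


def summarize(sentences, vectors, dim, top_n):
    """Return the top_n highest-ranked sentences, kept in document order."""
    mat = np.vstack([sentence_vector(s, vectors, dim) for s in sentences])
    scores = pagerank(cosine_similarity_matrix(mat))
    top = sorted(np.argsort(-scores)[:top_n])
    return [sentences[i] for i in top]
```

Calling `summarize(sentences, TOY_VECTORS, 3, 2)` on a short list of sentences returns the two sentences most central to the similarity graph; an off-topic sentence tends to rank low because it shares little weight with the rest.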
Pages: 288-297
Number of pages: 10