Word Embedding-Based Biomedical Text Summarization

Cited by: 2
Authors
Rouane, Oussama [1]
Belhadef, Hacene [1]
Bouakkaz, Mustapha [2]
Affiliations
[1] Univ Constantine 2 Abdelhamid Mehri, Constantine, Algeria
[2] Univ Amar Telidgi, Comp Sci Dept, Fac Sci, Laghouat, Algeria
Keywords
Biomedical text summarization; Word embedding; Word2vec; PageRank algorithm; ROUGE metrics;
DOI
10.1007/978-3-030-33582-3_28
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a novel word embedding-based biomedical text summarizer. Biomedical words are represented by dense real-valued vectors. Each sentence is represented by summing the vectors of the words it contains. The PageRank algorithm is applied to rank sentences, using cosine similarity as the similarity measure between sentence vectors. The top N ranked sentences are selected to build the summary. For the evaluation, we created a corpus of 200 biomedical papers downloaded from the BioMed Central full-text database. We used a pre-trained Word2vec model whose word vectors were generated from a combination of PubMed, PMC, and a recent English Wikipedia dump. We compared our method with four other summarizers using the ROUGE-1, ROUGE-2, ROUGE-3, and ROUGE-SU4 metrics, evaluating the generated summaries against the abstracts of the papers. Our summarizer achieved improvements of 3.48%, 7.68%, 9.76%, and 3.47%, respectively, over the second-ranked summarizer.
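The pipeline described in the abstract (sum word vectors per sentence, compare sentences by cosine similarity, rank with PageRank, keep the top N) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the `embeddings` dictionary is a hypothetical stand-in for the pre-trained Word2vec model, tokenization is a plain whitespace split, the damping factor of 0.85 is an assumed default, and clipping negative similarities to zero is an assumption made so that the edge weights stay non-negative.

```python
import numpy as np


def sentence_vector(tokens, embeddings, dim):
    """Sum the vectors of the in-vocabulary tokens of one sentence."""
    vec = np.zeros(dim)
    for tok in tokens:
        if tok in embeddings:
            vec += embeddings[tok]
    return vec


def cosine_matrix(vectors):
    """Pairwise cosine similarity with the diagonal set to zero."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0  # all-zero sentence vectors stay zero
    unit = vectors / norms
    sim = unit @ unit.T
    np.fill_diagonal(sim, 0.0)  # no self-loops in the sentence graph
    return np.clip(sim, 0.0, None)  # assumption: drop negative similarities


def pagerank(weights, damping=0.85, tol=1e-8, max_iter=200):
    """Power iteration on a weighted graph given as a square matrix."""
    n = weights.shape[0]
    col_sums = weights.sum(axis=0)
    safe_sums = np.where(col_sums > 0, col_sums, 1.0)
    # Column-stochastic transition matrix; isolated nodes jump uniformly.
    transition = np.where(col_sums > 0, weights / safe_sums, 1.0 / n)
    scores = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        updated = (1.0 - damping) / n + damping * (transition @ scores)
        if np.abs(updated - scores).sum() < tol:
            return updated
        scores = updated
    return scores


def summarize(sentences, embeddings, dim, top_n):
    """Return the top_n ranked sentences in their original order."""
    if not sentences:
        return []
    vectors = np.array(
        [sentence_vector(s.lower().split(), embeddings, dim) for s in sentences]
    )
    scores = pagerank(cosine_matrix(vectors))
    chosen = sorted(np.argsort(-scores)[:top_n])
    return [sentences[i] for i in chosen]
```

Returning the selected sentences in document order rather than rank order is a design choice of this sketch; the abstract does not say how the selected sentences are ordered in the final summary.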
Pages: 288-297
Number of pages: 10