Word Embedding-Based Biomedical Text Summarization

Cited by: 2
Authors
Rouane, Oussama [1]
Belhadef, Hacene [1]
Bouakkaz, Mustapha [2]
Affiliations
[1] Univ Constantine 2 Abdelhamid Mehri, Constantine, Algeria
[2] Univ Amar Telidgi, Comp Sci Dept, Fac Sci, Laghouat, Algeria
Keywords
Biomedical text summarization; Word embedding; Word2vec; PageRank algorithm; ROUGE metrics;
DOI
10.1007/978-3-030-33582-3_28
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we propose a novel word embedding-based biomedical text summarizer. Biomedical words are represented as dense real-valued vectors, and each sentence is represented by summing the vectors of the words it contains. The PageRank algorithm is applied to rank sentences, using cosine similarity as the measure of similarity between sentence vectors. The top N ranked sentences are selected to build the summary. For the evaluation, we created a corpus of 200 biomedical papers downloaded from the BioMed Central full-text database. We used a pre-trained Word2vec model whose word vectors were generated from a combination of PubMed, PMC, and a recent English Wikipedia dump. We compared our method with four other summarizers using the ROUGE-1, ROUGE-2, ROUGE-3, and ROUGE-SU4 metrics, evaluating the generated summaries against the abstracts of the papers. Our summarizer achieved improvements of 3.48%, 7.68%, 9.76%, and 3.47%, respectively, over the second-ranked summarizer.
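The pipeline outlined in the abstract (summed word vectors, cosine similarity, PageRank, top-N selection) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the whitespace tokenizer, the damping factor of 0.85, the clipping of negative similarities, and the choice to return the selected sentences in document order are all assumptions, and `embeddings` is a plain dictionary standing in for the pre-trained Word2vec model.

```python
import numpy as np


def sentence_vector(tokens, embeddings, dim):
    """Sum the vectors of the tokens that appear in the embedding table."""
    vec = np.zeros(dim)
    for tok in tokens:
        if tok in embeddings:
            vec += embeddings[tok]
    return vec


def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity between row vectors, with a zero diagonal."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty sentences
    unit = vectors / norms
    sim = unit @ unit.T
    np.fill_diagonal(sim, 0.0)  # no self-loops in the sentence graph
    # Assumption: negative similarities are dropped so edge weights stay >= 0.
    return np.clip(sim, 0.0, None)


def pagerank(weights, damping=0.85, tol=1e-8, max_iter=200):
    """Weighted PageRank by power iteration on a symmetric weight matrix."""
    n = weights.shape[0]
    col_sums = weights.sum(axis=0)
    safe_sums = np.where(col_sums > 0, col_sums, 1.0)
    # Column-stochastic transition matrix; isolated nodes jump uniformly.
    transition = np.where(col_sums > 0, weights / safe_sums, 1.0 / n)
    scores = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new_scores = (1 - damping) / n + damping * (transition @ scores)
        if np.abs(new_scores - scores).sum() < tol:
            return new_scores
        scores = new_scores
    return scores


def summarize(sentences, embeddings, dim, top_n):
    """Return the top_n ranked sentences, kept in their original order."""
    tokenized = [s.lower().split() for s in sentences]  # assumed tokenizer
    vectors = np.array([sentence_vector(t, embeddings, dim) for t in tokenized])
    scores = pagerank(cosine_similarity_matrix(vectors))
    top = sorted(np.argsort(-scores)[:top_n])
    return [sentences[i] for i in top]
```

In practice the dictionary would be replaced by lookups into a pre-trained biomedical embedding model, and a proper sentence splitter and tokenizer would replace `str.split`.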
Pages: 288-297
Page count: 10