Unsupervised extractive multi-document summarization method based on transfer learning from BERT multi-task fine-tuning

Cited by: 21
Authors
Lamsiyah, Salima [1]
El Mahdaouy, Abdelkader [3]
Ouatik, Said El Alaoui [1, 2]
Espinasse, Bernard [4]
Affiliations
[1] Sidi Mohamed Ben Abdellah Univ, FSDM, Lab Informat Signals Automat & Cognitivism, BP 1796, Fez Atlas 30003, Morocco
[2] Ibn Tofail Univ, Natl Sch Appl Sci, Lab Engn Sci, Kenitra, Morocco
[3] Mohammed VI Polytech Univ UM6P, Sch Comp Sci UM6P CS, Ben Guerir, Morocco
[4] Univ Toulon & Var, Aix Marseille Univ, CNRS, LIS,UMR 7020, Toulon, France
Keywords
BERT fine-tuning; multi-document summarization; multi-task learning; sentence representation learning; transfer learning; sentence scoring techniques
DOI
10.1177/0165551521990616
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Text representation is a cornerstone that affects the effectiveness of many text summarization methods. Transfer learning using pre-trained word embedding models has shown promising results. However, most of these representations do not consider word order or the semantic relationships between words in a sentence, and thus do not capture the meaning of a full sentence. To overcome this issue, the current study proposes an unsupervised method for extractive multi-document summarization based on transfer learning from the BERT sentence embedding model. Moreover, to improve sentence representation learning, we fine-tune the BERT model on supervised intermediate tasks from the GLUE benchmark datasets using single-task and multi-task fine-tuning methods. Experiments are performed on the standard DUC'2002-2004 datasets. The results show that our method significantly outperforms several baseline methods and achieves performance comparable to, and sometimes better than, recent state-of-the-art deep learning-based methods. Furthermore, the results show that fine-tuning BERT using multi-task learning considerably improves performance.
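The abstract does not spell out how sentences are scored once they are embedded, so the following is only a minimal sketch of one common unsupervised approach: score each sentence by its cosine similarity to the centroid of all sentence vectors, then greedily pick the top sentences while skipping near-duplicates. Everything here is an assumption for illustration. The `embed` function is a hypothetical bag-of-words stand-in for a fine-tuned BERT sentence encoder, and `summarize`, `k`, and `max_sim` are illustrative names, not the authors' implementation.

```python
import math
from collections import Counter


def embed(sentence, vocab):
    """Hypothetical stand-in for a sentence encoder (assumption, not BERT):
    returns bag-of-words counts over a fixed vocabulary."""
    counts = Counter(sentence.lower().split())
    return [float(counts[w]) for w in vocab]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def summarize(sentences, k=2, max_sim=0.8):
    """Pick up to k sentences closest to the cluster centroid,
    skipping any sentence too similar to one already chosen."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    vectors = [embed(s, vocab) for s in sentences]
    # Centroid: component-wise mean of all sentence vectors.
    centroid = [sum(col) / len(vectors) for col in zip(*vectors)]
    # Rank sentence indices by similarity to the centroid, best first.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: cosine(vectors[i], centroid),
                    reverse=True)
    chosen = []
    for i in ranked:
        # Redundancy filter: keep only sentences dissimilar to all chosen ones.
        if all(cosine(vectors[i], vectors[j]) < max_sim for j in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break
    # Return the selected sentences in their original order.
    return [sentences[i] for i in sorted(chosen)]


# Sentences pooled from several (toy) documents on the same topic.
cluster = [
    "the storm hit the coast",
    "the storm hit the coast",
    "rescue teams reached the coast",
    "stocks fell sharply",
]
summary = summarize(cluster, k=2)
```

In a realistic setting the toy `embed` would be replaced by vectors from a sentence encoder, which is where the single-task or multi-task fine-tuning described in the abstract would matter; the scoring and selection logic would stay the same.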
Pages: 164-182
Page count: 19