Unified extractive-abstractive summarization: a hybrid approach utilizing BERT and transformer models for enhanced document summarization

Authors
Divya, S. [1]
Sripriya, N. [1]
Andrew, J. [2]
Mazzara, Manuel [3]
Affiliations
[1] Department of Information Technology, SSN College of Engineering, Tamil Nadu, Kalavakkam, India
[2] Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Karnataka, Manipal, India
[3] Institute of Software Development and Engineering, Innopolis University, Innopolis, Russia
DOI
10.7717/peerj-cs.2424
Chinese Library Classification (CLC) number
G2 [Information and Knowledge Dissemination];
Discipline classification code
05 ; 0503 ;
Abstract
With the rapid growth in the number of digital documents, there is a pressing need for automated document summarization (ADS). Summarization is a compression technique that condenses a source document into a few concise sentences capturing its salient information. A primary challenge is producing a dependable summary, which depends on both the extracted features and human-established parameters. This article introduces a method that integrates extractive and abstractive techniques to keep the summary closely relevant to the input document. First, input sentences are converted into vector representations using BERT, and these representations are arranged into a symmetric matrix based on pairwise sentence similarity. Semantically similar sentences are then extracted from this matrix to construct an extractive summary. The transformer model integrates an objective function that is highly symmetric and invariant under unitary transformation for language generation. This model refines the extracted informative sentences and generates an abstractive summary resembling manually written summaries. The hybrid technique is applied to the CNN/DailyMail and DUC2004 datasets and evaluated using ROUGE metrics. Results demonstrate the superiority of the proposed technique over conventional summarization methods. © 2024 S et al.
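The extractive step described in the abstract (sentence representations, a symmetric similarity matrix, then sentence selection) can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the article: the toy vectors stand in for BERT sentence embeddings, cosine similarity is used as the similarity measure, and sentences are ranked by their total similarity to the others, a simple centrality heuristic. The function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """Build a symmetric matrix of pairwise sentence similarities."""
    n = len(vectors)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            s = cosine(vectors[i], vectors[j])
            sim[i][j] = s
            sim[j][i] = s  # mirror the value so the matrix stays symmetric
    return sim


def extract_indices(sim, k):
    """Pick the k sentences most similar to the rest (assumed heuristic).

    Returns the indices in document order.
    """
    n = len(sim)
    scores = [sum(sim[i][j] for j in range(n) if j != i) for i in range(n)]
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Toy vectors standing in for BERT sentence embeddings (illustrative only).
toy_vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.8, 0.2, 0.1],
]

sim = similarity_matrix(toy_vectors)
selected = extract_indices(sim, 2)
print(selected)
```

In a fuller pipeline the selected sentences would be passed to a transformer-based generator to produce the abstractive summary; that stage is omitted here because the abstract does not specify the model or its objective function in enough detail to sketch it.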
Pages: 1 / 26