Leveraging Information Bottleneck for Scientific Document Summarization

Authors
Ju, Jiaxin [1]
Liu, Ming [2, 3]
Koh, Huan Yee [1]
Jin, Yuan [1]
Du, Lan [1]
Pan, Shirui [1]
Affiliations
[1] Monash Univ, Fac Informat Technol, Melbourne, Vic, Australia
[2] Deakin Univ, Sch Informat Technol, Geelong, Vic, Australia
[3] Zhongtukexin Co Ltd, Beijing, Peoples R China
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents an unsupervised extractive approach to summarizing long scientific documents, based on the Information Bottleneck principle. Inspired by previous work that uses the Information Bottleneck principle for sentence compression, we extend it to document-level summarization in two separate steps. In the first step, we use one or more signals as queries to retrieve the key content from the source document. In the second step, a pre-trained language model performs further sentence search and editing to return the final extracted summaries. Importantly, our approach can be flexibly extended to a multi-view framework by using different signals. Automatic evaluation on three scientific document datasets verifies the effectiveness of the proposed framework. Further human evaluation suggests that the extracted summaries cover more content aspects than those of previous systems.
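As a rough illustration of the first step described in the abstract (using a signal as a query to retrieve key content), the sketch below ranks sentences by bag-of-words cosine similarity to a signal string. This is an assumption made purely for illustration: the abstract does not specify the scoring function, and the names `tokenize`, `cosine`, `retrieve`, the parameter `k`, and the sample sentences are all hypothetical. It does not implement the Information Bottleneck objective or the language-model search and edit step.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(sentences, signal, k=2):
    """Return the k sentences most similar to the signal, in document order.

    Stand-in scoring (assumption): plain term-count cosine similarity.
    """
    query = Counter(tokenize(signal))
    scored = [(cosine(Counter(tokenize(s)), query), i)
              for i, s in enumerate(sentences)]
    # Highest score first; ties broken by earlier position in the document.
    top = sorted(scored, key=lambda p: (-p[0], p[1]))[:k]
    # Restore document order so the extract reads naturally.
    return [sentences[i] for _, i in sorted(top, key=lambda p: p[1])]


# Hypothetical toy document and signal, not taken from the paper.
document = [
    "We thank the reviewers for their comments.",
    "The information bottleneck compresses the source while keeping relevant content.",
    "Experiments use three scientific document datasets.",
    "Our summarization method applies the information bottleneck to long documents.",
]
signal = "information bottleneck summarization"
selected = retrieve(document, signal, k=2)
```

Under the same assumptions, running `retrieve` once per signal would give one candidate extract per signal, which is one loose way to picture the multi-view extension mentioned in the abstract.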
Pages: 4091-4098
Page count: 8