Leveraging Information Bottleneck for Scientific Document Summarization

Citations: 0
Authors
Ju, Jiaxin [1]
Liu, Ming [2, 3]
Koh, Huan Yee [1]
Jin, Yuan [1]
Du, Lan [1]
Pan, Shirui [1]
Affiliations
[1] Monash Univ, Fac Informat Technol, Melbourne, Vic, Australia
[2] Deakin Univ, Sch Informat Technol, Geelong, Vic, Australia
[3] Zhongtukexin Co Ltd, Beijing, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents an unsupervised extractive approach to summarizing long scientific documents based on the Information Bottleneck principle. Inspired by previous work that uses the Information Bottleneck principle for sentence compression, we extend it to document-level summarization in two separate steps. In the first step, we use one or more signals as queries to retrieve the key content from the source document. In the second step, a pre-trained language model performs further sentence search and editing to return the final extracted summaries. Importantly, our work can be flexibly extended to a multi-view framework by using different signals. Automatic evaluation on three scientific document datasets verifies the effectiveness of the proposed framework. Further human evaluation suggests that the extracted summaries cover more content aspects than those of previous systems.
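As a rough illustration of the first step only (using a signal as a query to retrieve key content), the sketch below ranks sentences by term-count cosine similarity to the signal. This is an assumption made for illustration: the record does not specify how the paper scores sentences, and the function names `tokenize`, `cosine`, and `retrieve_by_signal` are hypothetical. The Information Bottleneck objective and the language-model search-and-edit step are not modeled here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count Counters (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_by_signal(sentences, signal, k=2):
    """Return the k sentences most similar to the signal, kept in document order.

    Term-overlap cosine is a stand-in scoring function (an assumption),
    not the retrieval method described in the paper.
    """
    query = Counter(tokenize(signal))
    scored = [(cosine(Counter(tokenize(s)), query), i) for i, s in enumerate(sentences)]
    # Highest score first; ties are broken by earlier position in the document.
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:k]
    return [sentences[i] for _, i in sorted(top, key=lambda pair: pair[1])]


document = [
    "We study summarization of long scientific documents.",
    "The weather was pleasant during the conference.",
    "Our extractive summarization method is unsupervised.",
]
selected = retrieve_by_signal(document, "unsupervised extractive summarization", k=2)
```

Under this sketch, a multi-view variant could call `retrieve_by_signal` once per signal and combine the results; how the paper actually combines views is not stated in this record.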
Pages: 4091-4098
Number of pages: 8