Leveraging Information Bottleneck for Scientific Document Summarization

Authors
Ju, Jiaxin [1]
Liu, Ming [2, 3]
Koh, Huan Yee [1]
Jin, Yuan [1]
Du, Lan [1]
Pan, Shirui [1]
Affiliations
[1] Monash Univ, Fac Informat Technol, Melbourne, Vic, Australia
[2] Deakin Univ, Sch Informat Technol, Geelong, Vic, Australia
[3] Zhongtukexin Co Ltd, Beijing, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents an unsupervised extractive approach to summarizing long scientific documents based on the Information Bottleneck principle. Inspired by previous work that uses the Information Bottleneck principle for sentence compression, we extend it to document-level summarization in two separate steps. In the first step, we use one or more signals as queries to retrieve the key content from the source document. In the second step, a pre-trained language model performs further sentence search and editing to return the final extracted summaries. Importantly, our work can be flexibly extended to a multi-view framework by using different signals. Automatic evaluation on three scientific document datasets verifies the effectiveness of the proposed framework. Further human evaluation suggests that the extracted summaries cover more content aspects than those of previous systems.
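The first step described in the abstract, using a signal as a query to retrieve key content, can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words cosine scoring, the function names (`tokenize`, `cosine`, `retrieve_key_sentences`), the parameter `k`, and the toy document are all assumptions chosen for illustration. The second step (search and editing with a pre-trained language model) is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters and digits as tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_key_sentences(sentences, signal, k=2):
    """Return the k sentences most similar to the signal, in document order.

    The similarity measure is a hypothetical stand-in for whatever
    relevance scoring the paper actually uses.
    """
    query = Counter(tokenize(signal))
    scored = [(cosine(Counter(tokenize(s)), query), i)
              for i, s in enumerate(sentences)]
    # Highest score first; ties are broken by earlier position.
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:k]
    # Restore the original sentence order for readability.
    return [sentences[i] for _, i in sorted(top, key=lambda pair: pair[1])]


# Toy document and signal, invented for this example.
document = [
    "We study summarization of long scientific documents.",
    "The weather was pleasant during the conference.",
    "Our extractive method selects sentences relevant to a query signal.",
    "Lunch was served at noon.",
]
selected = retrieve_key_sentences(
    document, "extractive summarization of scientific documents", k=2
)
print(selected)
```

Running the retrieval once per signal and combining the results would be one way to approximate the multi-view extension the abstract mentions, though how the paper combines views is not stated here.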
Pages: 4091-4098
Number of pages: 8