Leveraging Information Bottleneck for Scientific Document Summarization

Authors
Ju, Jiaxin [1]
Liu, Ming [2, 3]
Koh, Huan Yee [1]
Jin, Yuan [1]
Du, Lan [1]
Pan, Shirui [1]
Affiliations
[1] Monash Univ, Fac Informat Technol, Melbourne, Vic, Australia
[2] Deakin Univ, Sch Informat Technol, Geelong, Vic, Australia
[3] Zhongtukexin Co Ltd, Beijing, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents an unsupervised extractive approach to summarizing long scientific documents, based on the Information Bottleneck principle. Inspired by previous work that uses the Information Bottleneck principle for sentence compression, we extend it to document-level summarization in two separate steps. In the first step, we use one or more signals as queries to retrieve the key content from the source document. In the second step, a pre-trained language model performs further sentence search and editing to return the final extracted summaries. Importantly, our work can be flexibly extended to a multi-view framework by using different signals. Automatic evaluation on three scientific document datasets verifies the effectiveness of the proposed framework. Further human evaluation suggests that the extracted summaries cover more content aspects than those of previous systems.
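The first step described in the abstract (signals used as queries to retrieve key content) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the bag-of-words cosine similarity used as the relevance score, and the `top_k` cutoff are hypothetical stand-ins, not the scoring used in the paper. The second step, which relies on a pre-trained language model for sentence search and editing, is not reproduced here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase word/number tokens; a deliberately simple stand-in tokenizer.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_key_sentences(sentences, signals, top_k=2):
    # Hypothetical helper: score each sentence by its best match against
    # any signal (query), keep the top_k, and return them in document order.
    signal_vecs = [Counter(tokenize(s)) for s in signals]
    scored = []
    for idx, sent in enumerate(sentences):
        vec = Counter(tokenize(sent))
        score = max((cosine(vec, q) for q in signal_vecs), default=0.0)
        scored.append((score, idx))
    best = sorted(scored, key=lambda p: (-p[0], p[1]))[:top_k]
    return [sentences[i] for _, i in sorted(best, key=lambda p: p[1])]


if __name__ == "__main__":
    doc = [
        "We thank the reviewers.",
        "The model uses an information bottleneck objective.",
        "Lunch was served at noon.",
        "Summaries are extracted from long documents.",
    ]
    # Two signals act as two "views" on the same document.
    for sentence in retrieve_key_sentences(
        doc, ["information bottleneck", "document summaries"], top_k=2
    ):
        print(sentence)
```

Passing several signals at once mirrors, in a very loose way, the multi-view idea in the abstract: each sentence is kept if it matches any one of the signals well.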
Pages: 4091-4098
Number of pages: 8