Effective aggregation of various summarization techniques

Cited by: 31
Authors
Mehta, Parth [1]
Majumder, Prasenjit [2]
Affiliations
[1] Dhirubhai Ambani Inst Informat & Commun Technol, Near Indroda Circle, Gandhinagar 382007, Gujarat, India
[2] Dhirubhai Ambani Inst Informat & Commun Technol, 4209 FB-4, DA IICT, Near Indroda Circle, Gandhinagar 382007, Gujarat, India
Keywords
Summarization; Ensemble; Sentence; Extraction
DOI
10.1016/j.ipm.2017.11.002
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A large number of extractive summarization techniques have been developed in the past decade, but very few enquiries have been made into how these techniques differ from each other or which factors actually affect their performance. Such a meaningful comparison, if available, can be used to create a robust ensemble of these approaches, which has the potential to consistently outperform each individual summarization system. In this work we examine the roles of three principal components of an extractive summarization technique: the sentence ranking algorithm, the sentence similarity metric and the text representation scheme. We show that using a combination of several different sentence similarity measures, rather than only one, significantly improves the performance of the resultant meta-system. Even simple ensemble techniques, when used in an informed manner, prove to be very effective in improving the overall performance and consistency of summarization systems. A statistically significant improvement of about 5% to 10% in ROUGE-1 recall was achieved by aggregating various sentence similarity measures. In contrast, aggregating several ranking algorithms did not show a significant improvement in ROUGE score, but even in this case the resultant meta-systems were more robust than the candidate systems. The results suggest that new extractive summarization techniques should focus particularly on defining a better sentence similarity metric, and should use multiple sentence similarity scores and ranking algorithms rather than a single particular combination.
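The idea of aggregating several sentence similarity measures can be illustrated with a minimal sketch. It is not the authors' method: the two similarity functions (bag-of-words cosine and Jaccard), the degree-centrality ranking, the Borda-count aggregation and all function names below are assumptions chosen only to show what a "simple ensemble" over similarity measures might look like.

```python
import math
from collections import Counter


def tokenize(sentence):
    # Naive whitespace tokenizer (an assumption; any tokenizer could be used).
    return sentence.lower().split()


def cosine_similarity(a, b):
    # Cosine similarity over raw term-frequency vectors.
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def jaccard_similarity(a, b):
    # Jaccard overlap between the two sentences' word sets.
    sa, sb = set(tokenize(a)), set(tokenize(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def centrality_scores(sentences, similarity):
    # Degree centrality: each sentence is scored by its total similarity
    # to every other sentence in the document.
    return [
        sum(similarity(s, t) for j, t in enumerate(sentences) if j != i)
        for i, s in enumerate(sentences)
    ]


def borda_aggregate(score_lists):
    # Borda count: in each ranking, the best of n items earns n - 1 points,
    # the next n - 2, and so on; points are summed across rankings.
    n = len(score_lists[0])
    points = [0] * n
    for scores in score_lists:
        order = sorted(range(n), key=lambda i: scores[i], reverse=True)
        for rank, idx in enumerate(order):
            points[idx] += n - 1 - rank
    return points


def summarize(sentences, k=1):
    # Rank sentences once per similarity measure, then merge the rankings.
    score_lists = [
        centrality_scores(sentences, sim)
        for sim in (cosine_similarity, jaccard_similarity)
    ]
    points = borda_aggregate(score_lists)
    top = sorted(range(len(sentences)), key=lambda i: points[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]  # keep original sentence order
```

Calling `summarize(sentences, k=2)` on a list of sentence strings returns the two sentences that rank highest after the per-measure rankings are merged. Further similarity measures could be added to the tuple in `summarize` without changing the aggregation step.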
Pages: 145-158
Page count: 14