Feature-based Unsupervised Method for Salient Sentence Ranking in Text Summarization Task

Cited by: 0
Authors
Nguyen Minh Phuong [1]
Le The Anh [2]
Affiliations
[1] Japan Adv Inst Sci & Technol, Nomi, Ishikawa, Japan
[2] FPT Univ, Can Tho, Vietnam
Keywords
Unsupervised sentence scoring; salient sentence extraction; unsupervised multi-document summarization (MDS)
DOI
10.1145/3654522.3654556
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Salient sentence ranking is an essential task in data mining, especially in unsupervised document summarization. In this paper, we introduce a simple yet effective unsupervised method for extracting salient sentences from a cluster of documents. Our method derives a sentence score by combining several feature-based scores: position, topic, keyword, semantic, entity, and sentence-centroid scores. The proposed method has the potential to generate large-scale pseudo-summaries, which support summarization tasks. To this end, our approach can incorporate the pre-training objectives used in pre-trained language models to mitigate the lack of annotated datasets in low-resource languages such as Vietnamese. We also conducted experiments to verify the effectiveness of the individual feature-based scoring methods and their combinations. Our experimental results on two well-known benchmark datasets, MultiNews and NewSHead, show that the proposed method outperforms previous unsupervised approaches.
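The idea of combining several feature-based scores into one sentence ranking can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the abstract does not give the scoring formulas or weights, so the three features used here (a reciprocal position score, an average term-frequency keyword score, and a bag-of-words cosine similarity to the cluster centroid), the min-max normalization, the equal weights, and all function names are assumptions made for the example. The topic, semantic, and entity features are left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase alphanumeric tokens; a deliberately simple tokenizer (assumption).
    return re.findall(r"[a-z0-9]+", text.lower())


def position_scores(n):
    # Assumed heuristic: earlier sentences score higher (1, 1/2, 1/3, ...).
    return [1.0 / (i + 1) for i in range(n)]


def keyword_scores(token_lists):
    # Average cluster-wide frequency of the tokens in each sentence.
    freq = Counter(t for toks in token_lists for t in toks)
    return [sum(freq[t] for t in toks) / len(toks) if toks else 0.0
            for toks in token_lists]


def centroid_scores(token_lists):
    # Cosine similarity between each sentence's bag of words and the
    # bag of words of the whole cluster (used here as the centroid).
    centroid = Counter(t for toks in token_lists for t in toks)
    c_norm = math.sqrt(sum(v * v for v in centroid.values()))
    scores = []
    for toks in token_lists:
        vec = Counter(toks)
        norm = math.sqrt(sum(v * v for v in vec.values()))
        dot = sum(v * centroid[t] for t, v in vec.items())
        scores.append(dot / (norm * c_norm) if norm and c_norm else 0.0)
    return scores


def min_max(values):
    # Rescale a feature to [0, 1] so that features are comparable.
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def rank_sentences(sentences, weights=(1.0, 1.0, 1.0)):
    # Weighted sum of normalized feature scores; the weights are hypothetical.
    token_lists = [tokenize(s) for s in sentences]
    features = [
        position_scores(len(sentences)),
        keyword_scores(token_lists),
        centroid_scores(token_lists),
    ]
    normalized = [min_max(f) for f in features]
    combined = [sum(w * f[i] for w, f in zip(weights, normalized))
                for i in range(len(sentences))]
    order = sorted(range(len(sentences)), key=lambda i: combined[i], reverse=True)
    return [(sentences[i], combined[i]) for i in order]
```

Calling `rank_sentences` on the sentences of a document cluster returns `(sentence, score)` pairs sorted from most to least salient; the top-ranked sentences could then serve as a pseudo-summary. Further features would be added as extra entries in `features` with matching weights.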
Pages: 346-351
Number of pages: 6
Related papers
50 in total
  • [31] An unsupervised method for extractive multi-document summarization based on centroid approach and sentence embeddings
    Lamsiyah, Salima
    El Mahdaouy, Abdelkader
    Espinasse, Bernard
    Ouatik, Said El Alaoui
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 167
  • [32] UNSUPERVISED FEATURE RANKING AND SELECTION BASED ON AUTOENCODERS
    Sharifipour, Sasan
    Fayyazi, Hossein
    Sabokrou, Mohammad
    Adeli, Ehsan
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019: 3172 - 3176
  • [33] Information-content based sentence extraction for text summarization
    Mallett, D
    Elding, J
    Nascimento, MA
    ITCC 2004: INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY: CODING AND COMPUTING, VOL 2, PROCEEDINGS, 2004: 214 - 218
  • [34] Improving Quality of Vietnamese Text Summarization Based on Sentence Compression
    Ha Nguyen Thi Thu
    Cuong Nguyen Ngoc
    Tu Nguyen Ngoc
    Hiep Xuan Huynh
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2016, 7 (02) : 362 - 366
  • [35] Key sentence based text summarization using Keywords and WordNet
    Dang, Chenghua
    Luo, Xinjun
    WSEAS Transactions on Computers, 2007, 6 (05) : 829 - 834
  • [36] An Additive FAHP Based Sentence Score Function for Text Summarization
    Guran, Aysun
    Uysal, Mitat
    Ekinci, Yeliz
    Guran, Celal Barkan
    INFORMATION TECHNOLOGY AND CONTROL, 2017, 46 (01) : 53 - 69
  • [37] A new sentence similarity measure and sentence based extractive technique for automatic text summarization
    Aliguliyev, Ramiz M.
    EXPERT SYSTEMS WITH APPLICATIONS, 2009, 36 (04) : 7764 - 7772
  • [38] A statistically based sentence scoring method using mathematical combination for extractive Hindi text summarization
    Dhankhar, Sunil
    Gupta, Mukesh Kumar
    JOURNAL OF INTERDISCIPLINARY MATHEMATICS, 2022, 25 (03) : 773 - 790
  • [39] Feature-Based Subjectivity Classification of Filipino Text
    Regalado, Ralph Vincent J.
    Cheng, Charibeth K.
    2012 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP 2012), 2012: 57 - 60
  • [40] CENTRANK: A GRAPH CENTROID BASED RANKING FOR EXTRACTIVE TEXT SUMMARIZATION
    Hemamalini, S.
    Swaminathan, V.
    TWMS JOURNAL OF APPLIED AND ENGINEERING MATHEMATICS, 2023, 13 : 355 - 364