Feature-based Unsupervised Method for Salient Sentence Ranking in Text Summarization Task

Cited by: 0
Authors
Nguyen Minh Phuong [1]
Le The Anh [2]
Affiliations
[1] Japan Adv Inst Sci & Technol, Nomi, Ishikawa, Japan
[2] FPT Univ, Can Tho, Vietnam
Keywords
Unsupervised sentence scoring; salient sentence extraction; unsupervised multi-document summarization (MDS)
DOI
10.1145/3654522.3654556
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Salient sentence ranking is an essential task in data mining, especially in unsupervised document summarization. In this paper, we introduce a simple yet effective unsupervised method for extracting salient sentences from a cluster of documents. Our method combines sentence scores derived from several kinds of feature-based information: position, topic, keyword, semantic, entity, and sentence-centroid scores. The proposed method has the potential to generate large-scale pseudo-summaries that support summarization tasks. In this way, our approach can incorporate the pre-training objectives used in pre-trained language models, mitigating the lack of annotated datasets for low-resource languages such as Vietnamese. We also conducted experiments to verify the effectiveness of the individual feature-based scoring methods and their combinations. Our experimental results on two well-known benchmark datasets, MultiNews and NewSHead, show that the proposed method outperforms previous unsupervised approaches.
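
The abstract does not give the scoring formulas, so the following is only a minimal sketch of the general idea of combining feature-based sentence scores over a document cluster. Everything in it is an assumption made for illustration rather than the authors' method: the function names (`rank_sentences`, `min_max`, `cosine`, `tokenize`), the choice of three toy features (position, keyword, centroid) out of the six the abstract names, the reciprocal-position formula, the bag-of-words centroid, the min-max normalization, and the equal default weights.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercased word tokens; a deliberately simple stand-in for real preprocessing."""
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def min_max(values):
    """Rescale scores to [0, 1] so the features are comparable before summing."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_sentences(documents, weights=(1.0, 1.0, 1.0)):
    """Rank the sentences of a document cluster by a weighted sum of toy features.

    documents: list of documents, each a list of sentence strings.
    weights:   hypothetical (position, keyword, centroid) weights.
    Returns (sentence, score) pairs sorted from highest to lowest score.
    """
    sentences, position = [], []
    for doc in documents:
        for i, sent in enumerate(doc):
            sentences.append(sent)
            # Assumed position feature: earlier sentences score higher.
            position.append(1.0 / (i + 1))
    if not sentences:
        return []

    bags = [Counter(tokenize(s)) for s in sentences]
    cluster_tf = Counter()
    for bag in bags:
        cluster_tf.update(bag)

    # Assumed keyword feature: mean cluster-level frequency of a sentence's tokens.
    keyword = [
        sum(cluster_tf[t] * n for t, n in bag.items()) / max(sum(bag.values()), 1)
        for bag in bags
    ]
    # Assumed centroid feature: cosine similarity to the cluster's summed
    # term-count vector (same direction as the mean vector).
    centroid = [cosine(bag, cluster_tf) for bag in bags]

    features = [min_max(position), min_max(keyword), min_max(centroid)]
    scores = [
        sum(w * f[i] for w, f in zip(weights, features))
        for i in range(len(sentences))
    ]
    order = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [(sentences[i], scores[i]) for i in order]
```

Calling `rank_sentences(cluster)[:k]` on a list of tokenized-by-sentence documents would give the top `k` sentences as a rough pseudo-summary. Changing `weights` is one simple way to compare individual features against their combinations, in the spirit of the experiments the abstract mentions.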
Pages: 346-351
Number of pages: 6