Using Centroid Keywords and Word Mover's Distance for Single Document Extractive Summarization

Cited by: 1
Authors
Seitkali, Dauken [1]
Mussabayev, Rustam [1]
Affiliations
[1] Inst Informat & Computat Technol, 125 Pushkina, Alma Ata 050010, Kazakhstan
Keywords
Centroid; WMD; word2vec; extractive summarization;
DOI
10.1145/3342827.3342852
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents an unsupervised method for single-document extractive summarization. The main idea is to select sentences based on the Word Mover's Distance (WMD) similarity between each sentence and a set of centroid keywords. This approach leverages both the compositional property of word embeddings and the advantages of a recently introduced, powerful text-to-text distance metric. ROUGE results on the DUC 2002 data set show that the quality of the produced summaries is competitive with well-known state-of-the-art systems. We also discuss the limitations of gold summaries for evaluating the quality of summarization systems.
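The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made only so the example is self-contained: the `EMB` table holds made-up 2-D vectors in place of word2vec embeddings, the centroid keywords are simply the most frequent in-vocabulary words (the abstract does not say how they are chosen), and the distance is relaxed WMD, a cheaper lower bound that stands in for exact WMD. The helper names `tokens`, `one_sided`, `rwmd`, and `summarize` are hypothetical.

```python
import math
from collections import Counter

# Toy 2-D vectors standing in for word2vec embeddings (made-up values).
EMB = {
    "stock": (0.0, 1.0), "market": (0.1, 0.9), "price": (0.2, 0.8),
    "cat": (1.0, 0.0), "kitten": (0.9, 0.1),
}

def tokens(sentence):
    """Lowercase, strip punctuation, keep only words that have a vector."""
    words = (w.strip(".,;:!?\"'") for w in sentence.lower().split())
    return [w for w in words if w in EMB]

def one_sided(a, b):
    """Each word in `a` sends all of its mass to its nearest word in `b`."""
    weights = Counter(a)
    return sum(
        (n / len(a)) * min(math.dist(EMB[w], EMB[v]) for v in set(b))
        for w, n in weights.items()
    )

def rwmd(a, b):
    """Relaxed WMD: the larger of the two one-sided bounds (inf if either is empty)."""
    if not a or not b:
        return math.inf
    return max(one_sided(a, b), one_sided(b, a))

def summarize(sentences, n_keywords=2, n_sentences=1):
    tokenized = [tokens(s) for s in sentences]
    # Assumption: centroid keywords = most frequent in-vocabulary words.
    counts = Counter(w for t in tokenized for w in t)
    keywords = [w for w, _ in counts.most_common(n_keywords)]
    # Turn distance into a similarity in [0, 1]; higher means closer to the keywords.
    scored = [(1.0 / (1.0 + rwmd(t, keywords)), i) for i, t in enumerate(tokenized)]
    best = sorted(scored, reverse=True)[:n_sentences]
    # Return the selected sentences in document order.
    return [sentences[i] for i in sorted(i for _, i in best)]

doc = ["The stock market fell.", "The market price of stock rose.", "A kitten slept."]
print(summarize(doc))
```

With real embeddings, an exact WMD routine could replace `rwmd` without changing the rest of the sketch; the ranking step depends only on having some sentence-to-keywords distance.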
Pages: 149-152
Page count: 4
Related papers
50 in total
  • [21] Topic Modeling on User Stories using Word Mover's Distance
    Guelle, Kim Julian
    Ford, Nicholas
    Ebel, Patrick
    Brokhausen, Florian
    Vogelsang, Andreas
    [J]. 2020 IEEE SEVENTH INTERNATIONAL WORKSHOP ON ARTIFICIAL INTELLIGENCE FOR REQUIREMENTS ENGINEERING (AIRE 2020), 2020, : 52 - 60
  • [22] Extractive Text Summarization using Word Vector Embedding
    Jain, Aditya
    Bhatia, Divij
    Thakur, Manish K.
    [J]. 2017 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND DATA SCIENCE (MLDS 2017), 2017, : 51 - 55
  • [23] Gibberish, Assistant, or Master? Using Tweets Linking to News for Extractive Single-Document Summarization
    Wei, Zhongyu
    Gao, Wei
    [J]. SIGIR 2015: PROCEEDINGS OF THE 38TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2015, : 1003 - 1006
  • [24] Text document summarization using word embedding
    Mohd, Mudasir
    Jan, Rafiya
    Shah, Muzaffar
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2020, 143 (143)
  • [25] Topic Mover's Distance Based Document Classification
    Wu, Xinhui
    Li, Hui
    [J]. 2017 17TH IEEE INTERNATIONAL CONFERENCE ON COMMUNICATION TECHNOLOGY (ICCT 2017), 2017, : 1998 - 2002
  • [26] A topic modeled unsupervised approach to single document extractive text summarization
    Srivastava, Ridam
    Singh, Prabhav
    Rana, K. P. S.
    Kumar, Vineet
    [J]. KNOWLEDGE-BASED SYSTEMS, 2022, 246
  • [27] Extractive Multi-document Summarization using K-means, Centroid-based Method, MMR, and Sentence Position
    Hai Cao Manh
    Huong Le Thanh
    Tuan Luu Minh
    [J]. SOICT 2019: PROCEEDINGS OF THE TENTH INTERNATIONAL SYMPOSIUM ON INFORMATION AND COMMUNICATION TECHNOLOGY, 2019, : 29 - 35
  • [28] Analyzing Preprocessing Settings for Urdu Single-document Extractive Summarization
    Humayoun, Muhammad
    Yu, Hwanjo
    [J]. LREC 2016 - TENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2016, : 3686 - 3693
  • [29] Graph-based extractive text summarization based on single document
    Avaneesh Kumar Yadav
    Rama Shankar Ranvijay
    Ashish Kumar Yadav
    [J]. Multimedia Tools and Applications, 2024, 83 : 18987 - 19013
  • [30] Multi-document extractive summarization using semantic graph
    del Camino Valle, Oleyda
    Simon-Cuevas, Alfredo
    Valladares-Valdes, Eduardo
    Olivas, Jose A.
    Romero, Francisco P.
    [J]. PROCESAMIENTO DEL LENGUAJE NATURAL, 2019, (63): 103 - 110