Random Indexing and Modified Random Indexing based approach for extractive text summarization

Cited by: 8
Authors
Chatterjee, Niladri [1]
Sahoo, Pramod Kumar [1, 2]
Affiliations
[1] Indian Inst Technol Delhi, Dept Math, New Delhi 110016, India
[2] Def Res & Dev Org, Inst Syst Studies & Anal, Delhi 110054, India
Source
COMPUTER SPEECH AND LANGUAGE | 2015 / Vol. 29 / Issue 01
Keywords
Word Space Model; Random Indexing; PageRank; Convolution; Modified Random Indexing; INFORMATION;
DOI
10.1016/j.csl.2014.07.001
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Random Indexing based extractive text summarization has already been proposed in the literature. This paper examines that technique in detail and proposes several improvements, both in the formation of the index (word) vectors of the document and in the construction of context vectors, which uses convolution instead of addition on the index vectors. Experiments were conducted using both angular and linear distances as proximity metrics. As a result, three improved versions of the algorithm, viz. RISUM, RISUM+ and MRISUM, were obtained. These algorithms were applied to DUC 2002 documents and their comparative performance was studied, using different ROUGE metrics for evaluation. While RISUM and RISUM+ perform almost on par, MRISUM is found to outperform both significantly. MRISUM also outperforms the LSA+TRM based summarization approach. The study reveals that all three Random Indexing based techniques proposed here produce consistent results when linear distance is used for measuring proximity. (C) 2014 Elsevier Ltd. All rights reserved.
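The contrast the abstract draws (context vectors built by addition versus convolution, compared under angular versus linear distance) can be illustrated with a small sketch. This is not the paper's RISUM or MRISUM algorithm: the function names, the dimension, the window size, the use of circular convolution over a word's neighbours, and the reading of "linear distance" as Euclidean distance are all assumptions made for illustration.

```python
import math
import random


def make_index_vector(dim, nonzeros, rng):
    """Sparse ternary index vector: a few +1/-1 entries, zeros elsewhere."""
    vec = [0.0] * dim
    for i, pos in enumerate(rng.sample(range(dim), nonzeros)):
        vec[pos] = 1.0 if i % 2 == 0 else -1.0
    return vec


def circular_convolution(a, b):
    """Circular convolution of two equal-length vectors (O(n^2), for clarity)."""
    n = len(a)
    return [sum(a[j] * b[(k - j) % n] for j in range(n)) for k in range(n)]


def context_vectors(tokens, index_vectors, dim, window=1, use_convolution=False):
    """Accumulate, for each word, a contribution from every occurrence's neighbours.

    Additive variant: the contribution is the sum of the neighbours' index vectors.
    Convolution variant (assumed reading, not taken from the paper): the
    contribution is the circular convolution of the neighbours' index vectors.
    """
    context = {w: [0.0] * dim for w in set(tokens)}
    for i, word in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        neighbours = [tokens[j] for j in range(lo, hi) if j != i]
        if not neighbours:
            continue
        if use_convolution:
            contribution = index_vectors[neighbours[0]]
            for other in neighbours[1:]:
                contribution = circular_convolution(contribution, index_vectors[other])
        else:
            contribution = [0.0] * dim
            for other in neighbours:
                contribution = [c + x for c, x in zip(contribution, index_vectors[other])]
        context[word] = [c + x for c, x in zip(context[word], contribution)]
    return context


def angular_distance(a, b):
    """Angle in radians between two vectors; pi/2 for a zero vector (a convention)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        return math.pi / 2
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def euclidean_distance(a, b):
    """Euclidean distance, used here as the assumed 'linear' distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
DIM = 64
tokens = "random indexing builds word vectors from random index vectors".split()
index_vectors = {w: make_index_vector(DIM, 4, rng) for w in sorted(set(tokens))}

additive = context_vectors(tokens, index_vectors, DIM)
convolved = context_vectors(tokens, index_vectors, DIM, use_convolution=True)

for name, ctx in (("addition", additive), ("convolution", convolved)):
    a, b = ctx["random"], ctx["vectors"]
    print(name, angular_distance(a, b), euclidean_distance(a, b))
```

The sketch only shows where the two design choices enter: the combination rule inside `context_vectors` and the distance function applied afterwards. Sentence scoring and ranking (the PageRank step named in the keywords) are left out.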
Pages: 32 - 44
Page count: 13
Related papers
50 in total
  • [1] SINGLE DOCUMENT TEXT SUMMARIZATION USING RANDOM INDEXING AND NEURAL NETWORKS
    Chatterjee, Niladri
    Bhardwaj, Avikant
    KEOD 2010: Proceedings of the International Conference on Knowledge Engineering and Ontology Development, 2010, : 171 - 176
  • [2] Hybrid Latent Semantic Analysis and Random Indexing Model for Text Summarization
    Chatterjee, Niladri
    Yadav, Nidhika
    INFORMATION AND COMMUNICATION TECHNOLOGY FOR COMPETITIVE STRATEGIES, 2019, 40 : 149 - 156
  • [3] Lightweight Random Indexing for Polylingual Text Classification
    Fernandez, Alejandro Moreo
    Esuli, Andrea
    Sebastiani, Fabrizio
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2016, 57 : 151 - 185
  • [4] Extraction-based single-document summarization using random indexing
    Chatterjee, Niladri
    Mohan, Shiwali
    19TH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, VOL II, PROCEEDINGS, 2007, : 448 - +
  • [5] Automatic text summarization based on latent semantic indexing
    Ai, Dongmei
    Zheng, Yuchao
    Zhang, Dezheng
    ARTIFICIAL LIFE AND ROBOTICS, 2010, 15 (01) : 25 - 29
  • [6] Fast latent semantic indexing in text processing based on random mapping
    Qian, Xiao-Dong
    Wang, Zheng-Ou
    Tianjin Daxue Xuebao (Ziran Kexue yu Gongcheng Jishu Ban)/Journal of Tianjin University Science and Technology, 2005, 38 (04) : 372 - 376
  • [7] Random indexing of text samples for latent semantic analysis
    Kanerva, P
    Kristoferson, J
    Holst, H
    PROCEEDINGS OF THE TWENTY-SECOND ANNUAL CONFERENCE OF THE COGNITIVE SCIENCE SOCIETY, 2000, : 1036 - 1036
  • [8] Random Indexing Revisited
    QasemiZadeh, Behrang
    NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, NLDB 2015, 2015, 9103 : 437 - 442
  • [9] Discovering word senses from text using random indexing
    Chatterjee, Niladri
    Mohan, Shiwali
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, 2008, 4919 : 299 - +
  • [10] Lightweight Random Indexing for Polylingual Text Classification (Extended Abstract)
    Moreo Fernandez, Alejandro
    Esuli, Andrea
    Sebastiani, Fabrizio
    PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018, : 5642 - 5646