Optimizing Sentence Modeling and Selection for Document Summarization

Authors
Yin, Wenpeng [1 ]
Pei, Yulong [2 ]
Affiliations
[1] Univ Munich, Ctr Informat & Language Proc, Munich, Germany
[2] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Extractive document summarization aims to summarize given documents by extracting salient sentences. It often faces two challenges: 1) how to model the information redundancy among candidate sentences; 2) how to select the most appropriate sentences. This paper attempts to build a strong summarizer, DivSelect+CNNLM, by presenting new algorithms to address each challenge. Concretely, it proposes CNNLM, a novel neural network language model (NNLM) based on a convolutional neural network (CNN), to project sentences into dense distributed representations, and then models sentence redundancy by cosine similarity. It then formulates the selection process as an optimization problem, constructing a diversified selection process (DivSelect) that aims to select sentences that have high prestige and, at the same time, are dissimilar to each other. Experimental results on the DUC2002 and DUC2004 benchmark data sets demonstrate the effectiveness of the approach.
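The two ideas named in the abstract, cosine similarity as a redundancy measure and selection that trades prestige against similarity, can be illustrated with a minimal sketch. This is not the paper's DivSelect optimization or its CNNLM model: the greedy rule, the `penalty` weight, the function names, and the toy vectors and prestige scores below are all assumptions made only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_diverse(vectors, prestige, k, penalty=1.0):
    """Greedily pick up to k sentence indices.

    Hypothetical stand-in for a diversified selection step: each round takes
    the candidate with the highest prestige minus `penalty` times its maximum
    cosine similarity to the sentences already selected.
    """
    selected = []
    candidates = set(range(len(vectors)))
    while candidates and len(selected) < k:
        def score(i):
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected),
                default=0.0,
            )
            return prestige[i] - penalty * redundancy

        # Iterate in sorted order so ties resolve to the lowest index.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data (assumed): sentences 0 and 1 are near-duplicates, sentence 2 differs.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
prestige = [0.9, 0.8, 0.5]
picked = select_diverse(vectors, prestige, k=2)
print(picked)
```

With these toy inputs, sentence 1 has the second-highest prestige but is nearly identical to sentence 0, so the greedy rule skips it in favor of the less prestigious but dissimilar sentence 2.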
Pages: 1383-1389
Number of pages: 7