Optimizing Sentence Modeling and Selection for Document Summarization

Cited by: 0
Authors
Yin, Wenpeng [1]
Pei, Yulong [2]
Affiliations
[1] Univ Munich, Ctr Informat & Language Proc, Munich, Germany
[2] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Extractive document summarization aims to summarize given documents by extracting salient sentences. It typically faces two challenges: 1) how to model the information redundancy among candidate sentences; 2) how to select the most appropriate sentences. This paper attempts to build a strong summarizer, DivSelect+CNNLM, by presenting a new algorithm for each of these challenges. Concretely, it proposes CNNLM, a novel neural network language model (NNLM) based on a convolutional neural network (CNN), to project sentences into dense distributed representations, and then models sentence redundancy by cosine similarity. It then formulates the selection process as an optimization problem, constructing a diversified selection process (DivSelect) that aims to select sentences that have high prestige and are, at the same time, dissimilar to one another. Experimental results on the DUC2002 and DUC2004 benchmark data sets demonstrate the effectiveness of the approach.
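The two ingredients named in the abstract, cosine similarity as the redundancy measure and a selection step that trades prestige against similarity, can be illustrated with a small sketch. This is not the paper's DivSelect optimization or its CNNLM model, neither of which is specified in this record. It is a simple greedy stand-in, and the function names, the `penalty` parameter, and the toy vectors and prestige scores are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def greedy_diverse_select(vectors, prestige, k, penalty=1.0):
    """Pick up to k sentence indices, balancing prestige against redundancy.

    Hypothetical greedy heuristic (NOT the paper's DivSelect): at each step,
    choose the candidate i that maximizes
        prestige[i] - penalty * (max cosine similarity to any selected sentence).
    Ties are broken in favor of the lowest index.
    """
    selected = []
    candidates = set(range(len(vectors)))
    while candidates and len(selected) < k:

        def gain(i):
            # Redundancy is 0.0 while nothing has been selected yet.
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected),
                default=0.0,
            )
            return prestige[i] - penalty * redundancy

        best = max(sorted(candidates), key=gain)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    # Toy sentence vectors: sentences 0 and 1 are near-duplicates.
    vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    prestige = [0.9, 0.8, 0.5]
    print(greedy_diverse_select(vectors, prestige, k=2))
```

In the toy input, sentence 1 has the second-highest prestige but is nearly parallel to sentence 0, so the redundancy penalty pushes the selection toward sentence 2 instead. Setting `penalty=0.0` reduces the procedure to ranking by prestige alone.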
Pages: 1383-1389
Page count: 7
Related papers
50 in total
  • [31] Automatic Summarization of Polish News Articles by Sentence Selection
    Jassem, Krzysztof
    Pawluczuk, Lukasz
    PROCEEDINGS OF THE 2015 FEDERATED CONFERENCE ON COMPUTER SCIENCE AND INFORMATION SYSTEMS, 2015, 5 : 337 - 341
  • [32] An approach to sentence-selection-based text summarization
    Chen, F
    Han, KS
    Chen, GL
    2002 IEEE REGION 10 CONFERENCE ON COMPUTERS, COMMUNICATIONS, CONTROL AND POWER ENGINEERING, VOLS I-III, PROCEEDINGS, 2002, : 489 - 493
  • [33] Text Summarization Based on Sentence Selection with Semantic Representation
    Zhang, Chi
    Zhang, Lei
    Wang, Chong-Jun
    Xie, Jun-Yuan
    2014 IEEE 26TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2014, : 584 - 590
  • [34] Cohesion-based Sentence Ordering for Multi-document Summarization
    Jiang, Xiaoyu
    2016 INTERNATIONAL CONFERENCE ON INFORMATION ENGINEERING AND COMMUNICATIONS TECHNOLOGY (IECT 2016), 2016, : 78 - 83
  • [35] Automated Bengali Document Summarization By Collaborating Individual Word & Sentence Scoring
    Chandro, Porimol
    Arif, Md Faizul Huq
    Rahman, Md Mahbubur
    Siddik, Md Saeed
    Rahman, Mohammad Sayeedur
    Rahman, Md Abdur
    2018 21ST INTERNATIONAL CONFERENCE OF COMPUTER AND INFORMATION TECHNOLOGY (ICCIT), 2018,
  • [36] SummPip: Unsupervised Multi-Document Summarization with Sentence Graph Compression
    Zhao, Jinming
    Liu, Ming
    Gao, Longxiang
    Jin, Yuan
    Du, Lan
    Zhao, He
    Zhang, He
    Haffari, Gholamreza
    PROCEEDINGS OF THE 43RD INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '20), 2020, : 1949 - 1952
  • [37] Tiered sentence based topic model for multi-document summarization
    Akhtar, Nadeem
    Beg, M. M. Sufyan
    Javed, Hira
    Hussain, Md Muzakkir
    JOURNAL OF INFORMATION & OPTIMIZATION SCIENCES, 2022, 43 (08): : 2131 - 2141
  • [38] A hybrid model for sentence ordering in extractive multi-document summarization
    Liu, Dexi
    Zhang, Zengchang
    He, Yanxiang
    Ji, Donghong
    INFORMATION RETRIEVAL TECHNOLOGY, PROCEEDINGS, 2006, 4182 : 588 - 592
  • [39] Enhanced automatic abstractive document summarization using transformers and sentence grouping
    Toprak, Ahmet
    Turan, Metin
    JOURNAL OF SUPERCOMPUTING, 2025, 81 (04):
  • [40] Subtopic-focused sentence scoring in multi-document summarization
    Li Sujian
    Qu Weiguang
    ALPIT 2007: PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON ADVANCED LANGUAGE PROCESSING AND WEB INFORMATION TECHNOLOGY, 2007, : 98 - +