Optimizing Sentence Modeling and Selection for Document Summarization

Times cited: 0
Authors
Yin, Wenpeng [1]
Pei, Yulong [2]
Affiliations
[1] Univ Munich, Ctr Informat & Language Proc, Munich, Germany
[2] Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Extractive document summarization aims to summarize given documents by extracting salient sentences. It typically faces two challenges: 1) how to model the information redundancy among candidate sentences; 2) how to select the most appropriate sentences. This paper attempts to build a strong summarizer, DivSelect+CNNLM, by presenting new algorithms that address each challenge. Concretely, it proposes CNNLM, a novel neural network language model (NNLM) based on a convolutional neural network (CNN), to project sentences into dense distributed representations, and then models sentence redundancy by cosine similarity. It then formulates the selection process as an optimization problem, constructing a diversified selection process (DivSelect) that aims to select sentences that have high prestige while being dissimilar to each other. Experimental results on the DUC2002 and DUC2004 benchmark data sets demonstrate the effectiveness of the approach.
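The abstract does not state the DivSelect objective or the CNNLM architecture, so the following is only a rough sketch of the general idea, not the authors' method. It assumes sentence vectors and prestige scores are already available, measures redundancy by cosine similarity as the abstract describes, and substitutes a simple greedy prestige-minus-redundancy rule (in the spirit of maximal marginal relevance) for the paper's optimization. The names `cosine`, `greedy_diverse_select`, and `trade_off` are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def greedy_diverse_select(
    vectors: Sequence[Sequence[float]],
    prestige: Sequence[float],
    k: int,
    trade_off: float = 0.5,  # assumed weighting, not taken from the paper
) -> List[int]:
    """Pick up to k sentence indices with high prestige and low mutual similarity.

    Greedy stand-in for a diversified selection step: at each round, choose
    the candidate maximizing
        trade_off * prestige - (1 - trade_off) * max similarity to chosen.
    """
    selected: List[int] = []
    candidates = set(range(len(vectors)))
    while candidates and len(selected) < k:

        def gain(i: int) -> float:
            # Redundancy is the highest similarity to any sentence already chosen.
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected),
                default=0.0,
            )
            return trade_off * prestige[i] - (1.0 - trade_off) * redundancy

        # Iterate in index order so ties resolve to the lowest index.
        best = max(sorted(candidates), key=gain)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data (made up for illustration): sentences 0 and 1 are near-duplicates.
toy_vectors = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
toy_prestige = [0.9, 0.85, 0.5]
toy_summary = greedy_diverse_select(toy_vectors, toy_prestige, k=2)
```

On the toy data, sentence 1 has the second-highest prestige but is nearly identical to sentence 0, so the sketch selects sentences 0 and 2 instead.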
Pages: 1383-1389
Number of pages: 7