Query-oriented unsupervised multi-document summarization via deep learning model

Cited by: 55
Authors
Zhong, Sheng-hua [1, 2]
Liu, Yan [2]
Li, Bin [3]
Long, Jing [4]
Affiliations
[1] Shen Zhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Guangdong, Peoples R China
[2] Hong Kong Polytech Univ, Dept Comp, Kowloon 999077, Hong Kong, Peoples R China
[3] City Univ Hong Kong, Dept Linguist & Translat, Kowloon 999077, Hong Kong, Peoples R China
[4] Nanjing Univ, Sch Business, Nanjing 210093, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Deep learning; Query-oriented summarization; Multi-document; Neocortex simulation; INFERENCE; ALGORITHM;
DOI
10.1016/j.eswa.2015.05.034
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Capturing the compositional process from words to documents is a key challenge in natural language processing and information retrieval. Extractive query-oriented multi-document summarization generates a summary by extracting a proper set of sentences from multiple documents based on a pre-given query. This paper proposes a novel document summarization framework based on a deep learning model, a class of models that has shown outstanding extraction ability in many real-world applications. The framework consists of three parts: concept extraction, summary generation, and reconstruction validation. A new query-oriented extraction technique is proposed to extract information distributed across multiple documents. Then, the whole deep architecture is fine-tuned by minimizing the information loss in reconstruction validation. Based on the concepts extracted from the deep architecture layer by layer, dynamic programming is used to seek the most informative set of sentences for the summary. Experiments on three benchmark datasets (DUC 2005, 2006, and 2007) assess and confirm the effectiveness of the proposed framework and algorithms. The experimental results show that the proposed method outperforms state-of-the-art extractive summarization approaches. Moreover, we provide a statistical analysis of query words based on Amazon's Mechanical Turk (MTurk) crowdsourcing platform. There exist underlying relationships from topic words to the content, which can contribute to the summarization task. (C) 2015 Elsevier Ltd. All rights reserved.
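The abstract states only that dynamic programming is used to pick the most informative set of sentences; it does not give the formulation. The block below is a minimal sketch under stated assumptions, not the authors' algorithm: it treats selection as a 0/1 knapsack in which each sentence has a hypothetical importance score and a word count, and the summary has a fixed word budget. The function name `select_sentences` and all inputs are illustrative.

```python
from typing import List, Tuple


def select_sentences(
    scores: List[float], lengths: List[int], budget: int
) -> Tuple[float, List[int]]:
    """Pick sentence indices that maximize total score within a word budget.

    Assumption: a plain 0/1 knapsack stands in for the paper's
    (unspecified) dynamic-programming step.
    """
    n = len(scores)
    # best[i][b] = best total score using the first i sentences
    # with at most b words.
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w, s = lengths[i - 1], scores[i - 1]
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]  # skip sentence i-1
            if w <= b and best[i - 1][b - w] + s > best[i][b]:
                best[i][b] = best[i - 1][b - w] + s  # take sentence i-1

    # Backtrack to recover which sentences were taken.
    chosen: List[int] = []
    b = budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= lengths[i - 1]
    chosen.reverse()
    return best[n][budget], chosen


if __name__ == "__main__":
    # Hypothetical scores and word counts for three candidate sentences.
    total, picked = select_sentences([3.0, 4.0, 5.0], [2, 3, 4], budget=5)
    print(total, picked)
```

In this toy input the first two sentences together fit the five-word budget and outscore the third sentence alone. How the scores would be derived from the concepts extracted by the deep architecture is not described in the abstract and is left out of the sketch.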
Pages: 8146-8155
Number of pages: 10