Multi-document extractive text summarization: A comparative assessment on features

Cited by: 23
Authors
Mutlu, Begum [1]
Sezer, Ebru A. [2]
Akcayol, M. Ali [1]
Affiliations
[1] Gazi Univ, Dept Comp Engn, TR-06570 Ankara, Turkey
[2] Hacettepe Univ, Dept Comp Engn, TR-06800 Ankara, Turkey
Keywords
Multi-document text summarization; Feature space; Sentence scoring-extraction; Artificial neural network; Fuzzy inference system; SENTENCE SCORING TECHNIQUES; FUZZY RULES;
DOI
10.1016/j.knosys.2019.07.019
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Text summarization is the process of generating a brief version of a text that preserves the salient information of the text. For information retrieval, it is a good dimension reduction solution. In addition, it reduces the required reading time. This study focused on extracting informative summaries from multiple documents using commonly used hand-crafted features from the literature. The first investigation focused on the generation of a feature vector. The features were the number of sentences, term frequency, similarity with the title, term frequency-inverse sentence frequency, sentence position, sentence length, sentence-sentence similarity, bushy-path results, phrases of the sentence, proper nouns, n-gram co-occurrence, and length of the document. Secondly, several combinations of these features were examined and a shallow multi-layer perceptron and two differently modeled fuzzy inference systems were used to extract salient sentences from texts in the Document Understanding Conference (DUC) dataset. The summarization performances of these models were evaluated using original classification performance metrics, and recall-oriented understudy for gisting evaluation (ROUGE)-n. This study recommended the use of fuzzy systems based on a feature vector and a fuzzy rule set for extractive text summarization. The extraction methods were evaluated against a changing compression ratio. Results of experiments showed that the implemented neural model tended to incorrectly infer sentences that were not considered salient by human annotators. However, for distinguishing between summary-worthy and summary-unworthy sentences, the fuzzy inference systems performed better than the utilized neural network, as well as better than the existing fuzzy inference-based text summarization approaches in the literature. (C) 2019 Elsevier B.V. All rights reserved.
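The abstract names ROUGE-n as one of the evaluation measures. As a rough illustration only, the following Python sketch computes ROUGE-n recall in its commonly used form (clipped n-gram overlap divided by the number of n-grams in the reference summary). The function names `ngrams` and `rouge_n_recall`, the whitespace tokenization, and the lowercasing are assumptions made for this example; the record does not state the paper's actual evaluation setup.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter (multiset) of the n-grams in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-n recall of a candidate summary against one reference summary.

    Assumption: simple lowercased whitespace tokenization, no stemming
    or stop-word removal.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        # Reference is shorter than n tokens: no n-grams to recall.
        return 0.0
    # Clip each match by how often the n-gram occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total
```

For example, `rouge_n_recall("the cat sat on the mat", "the cat lay on the mat", n=2)` compares the five bigrams of the reference with those of the candidate; three of them match.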
Pages: 13
Related papers
50 in total
  • [1] Survey on Extractive Text Summarization Methods with Multi-Document Datasets
    Varalakshmi, P. N. K.
    Kallimani, Jagadish S.
    [J]. 2018 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2018, : 2113 - 2119
  • [2] Multi-document extractive text summarization based on firefly algorithm
    Tomer, Minakshi
    Kumar, Manoj
    [J]. JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2022, 34 (08) : 6057 - 6065
  • [3] Unsupervised extractive multi-document text summarization using a Genetic Algorithm
    Neri-Mendoza, Veronica
    Ledeneva, Yulia
    Garcia-Hernandez, Rene Arnulfo
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2020, 39 (02) : 2397 - 2408
  • [4] Extractive multi-document text summarization based on graph independent sets
    Uckan, Taner
    Karci, Ali
    [J]. EGYPTIAN INFORMATICS JOURNAL, 2020, 21 (03) : 145 - 157
  • [5] Multi-Document Extractive Text Summarization via Deep Learning Approach
    Rezaei, Afsaneh
    Dami, Sina
    Daneshjoo, Parisa
    [J]. 2019 IEEE 5TH CONFERENCE ON KNOWLEDGE BASED ENGINEERING AND INNOVATION (KBEI 2019), 2019, : 680 - 685
  • [6] Experimental analysis of multiple criteria for extractive multi-document text summarization
    Sanchez-Gomez, Jesus M.
    Vega-Rodriguez, Miguel A.
    Perez, Carlos J.
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2020, 140
  • [7] Extractive Multi-Document Text Summarization by Using Binary Particle Swarm Optimization
    Potnurwar, Archana
    Pimpalshende, Anjusha
    Aote, Shailendra S.
    Bongirwar, Vrusbali
    [J]. BIOSCIENCE BIOTECHNOLOGY RESEARCH COMMUNICATIONS, 2020, 13 (14) : 32 - 34
  • [8] Parallelizing a multi-objective optimization approach for extractive multi-document text summarization
    Sanchez-Gomez, Jesus M.
    Vega-Rodriguez, Miguel A.
    Perez, Carlos J.
    [J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2019, 134 : 166 - 179
  • [9] Extractive Multi-document Text Summarization Leveraging Hybrid Semantic Similarity Measures
    Bandaru, Rajesh
    Radhika, Dr. Y.
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (09) : 844 - 852
  • [10] Topic modeling combined with classification technique for extractive multi-document text summarization
    Rajendra Kumar Roul
    [J]. Soft Computing, 2021, 25 : 1113 - 1127