Multi-document extractive text summarization: A comparative assessment on features

Cited by: 23

Authors:

Mutlu, Begum [1]

Sezer, Ebru A. [2]

Akcayol, M. Ali [1]

Affiliations:

[1] Gazi Univ, Dept Comp Engn, TR-06570 Ankara, Turkey

[2] Hacettepe Univ, Dept Comp Engn, TR-06800 Ankara, Turkey

Source:

KNOWLEDGE-BASED SYSTEMS | 2019 / Volume 183

Keywords:

Multi-document text summarization; Feature space; Sentence scoring-extraction; Artificial neural network; Fuzzy inference system; SENTENCE SCORING TECHNIQUES; FUZZY RULES;

DOI:

10.1016/j.knosys.2019.07.019

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Text summarization is the process of generating a brief version of a text that preserves the salient information of the text. For information retrieval, it is a good dimension reduction solution. In addition, it reduces the required reading time. This study focused on extracting informative summaries from multiple documents using commonly used hand-crafted features from the literature. The first investigation focused on the generation of a feature vector. The features were the number of sentences, term frequency, similarity with the title, term frequency-inverse sentence frequency, sentence position, sentence length, sentence-sentence similarity, bushy-path results, phrases of the sentence, proper nouns, n-gram co-occurrence, and length of the document. Secondly, several combinations of these features were examined and a shallow multi-layer perceptron and two differently modeled fuzzy inference systems were used to extract salient sentences from texts in the Document Understanding Conference (DUC) dataset. The summarization performances of these models were evaluated using original classification performance metrics, and recall-oriented understudy for gisting evaluation (ROUGE)-n. This study recommended the use of fuzzy systems based on a feature vector and a fuzzy rule set for extractive text summarization. The extraction methods were evaluated against a changing compression ratio. Results of experiments showed that the implemented neural model tended to incorrectly infer sentences that were not considered salient by human annotators. However, for distinguishing between summary-worthy and summary-unworthy sentences, the fuzzy inference systems performed better than the utilized neural network, as well as better than the existing fuzzy inference-based text summarization approaches in the literature. (C) 2019 Elsevier B.V. All rights reserved.
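To make a few of the hand-crafted features and the ROUGE-n metric named in the abstract concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the tokenizer, the normalizations, the function names, and the exact feature formulas are assumptions chosen for illustration, and only four of the listed features are covered (term frequency-inverse sentence frequency, sentence position, sentence length, and similarity with the title). The ROUGE-n function computes recall only, against a single reference summary.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphanumeric runs (an assumed, simplistic tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_features(sentences, title):
    """Return one feature dict per sentence (illustrative formulas only)."""
    tokenized = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Sentence frequency: in how many sentences each term appears.
    sf = Counter(term for toks in tokenized for term in set(toks))
    title_terms = set(tokenize(title))
    max_len = max((len(t) for t in tokenized), default=0) or 1
    features = []
    for i, toks in enumerate(tokenized):
        tf = Counter(toks)
        # Mean TF-ISF over the sentence's tokens, with ISF = log(n / sf).
        tf_isf = (
            sum(tf[t] * math.log(n / sf[t]) for t in tf) / len(toks)
            if toks else 0.0
        )
        overlap = len(set(toks) & title_terms)
        features.append({
            "tf_isf": tf_isf,
            "position": 1.0 - i / n,        # earlier sentences score higher
            "length": len(toks) / max_len,  # relative to the longest sentence
            "title_similarity": overlap / len(title_terms) if title_terms else 0.0,
        })
    return features


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-n recall: clipped n-gram overlap divided by reference n-gram count."""
    cand = ngrams(tokenize(candidate), n)
    ref = ngrams(tokenize(reference), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    docs = ["The cat sat on the mat.", "Dogs bark.", "The cat likes the mat."]
    for sentence, feats in zip(docs, sentence_features(docs, "The cat")):
        print(sentence, feats)
    print(rouge_n_recall("the cat sat", "the cat sat on the mat", n=1))
```

A feature vector like this would be the input to a sentence scorer such as the multi-layer perceptron or fuzzy inference systems mentioned in the abstract; those models are not sketched here.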


Pages: 13

